PYK2 CRYSTAL STRUCTURE AND USES



CROSS-REFERENCE TO RELATED PATENT APPLICATIONS 

[0001] This application claims the benefit of Ibrahim et al., U.S. Provisional Application 
60/451,101, filed February 28, 2003, which is incorporated herein by reference in its 
entirety, including drawings. 

BACKGROUND OF THE INVENTION 

[0002] This invention relates to the field of development of ligands for protein tyrosine 
kinase 2 (PYK2) and to the use of crystal structures of PYK2. The information provided 
is intended solely to assist the understanding of the reader. None of the information 
provided nor references cited is admitted to be prior art to the present invention. 

[0003] Cellular signal transduction is a fundamental mechanism whereby external 
stimuli that regulate diverse cellular processes are relayed to the interior of cells. One of 
the key biochemical mechanisms of signal transduction involves the reversible 
phosphorylation of tyrosine residues on proteins. The phosphorylation state of a protein is 
modified through the reciprocal actions of tyrosine phosphatases (TPs) and tyrosine 
kinases (TKs), including receptor tyrosine kinases and non-receptor tyrosine kinases. 

[0004] Receptor tyrosine kinases (RTKs) belong to a family of transmembrane proteins 
and have been implicated in cellular signaling pathways. The predominant biological 
activity of some RTKs is the stimulation of cell growth and proliferation, while other 
RTKs are involved in arresting growth and promoting differentiation. In some instances, a 
single tyrosine kinase can inhibit, or stimulate, cell proliferation depending on the cellular 
environment in which it is expressed. 

[0005] RTKs are composed of at least three domains: an extra-cellular ligand binding 
domain, a transmembrane domain and a cytoplasmic catalytic domain that can 
phosphorylate tyrosine residues. Ligand binding to membrane-bound receptors induces 
the formation of receptor dimers and allosteric changes that activate the intracellular
kinase domains and result in the self-phosphorylation (autophosphorylation and/or 
transphosphorylation) of the receptor on tyrosine residues. Individual phosphotyrosine 
residues of the cytoplasmic domains of receptors may serve as specific binding sites that 
interact with a host of cytoplasmic signaling molecules, thereby activating various signal
transduction pathways. 

[0006] The intracellular, cytoplasmic, non-receptor protein tyrosine kinases do not 
contain a hydrophobic transmembrane domain or an extracellular domain and share non- 
catalytic domains in addition to sharing their catalytic kinase domains. Such non-catalytic 
domains include the SH2 domains and SH3 domains. The non-catalytic domains are 
thought to be important in the regulation of protein-protein interactions during signal 
transduction. 

[0007] A central feature of signal transduction is the reversible phosphorylation of 
certain proteins. Receptor phosphorylation stimulates a physical association of the 
activated receptor with target molecules, which either are or are not phosphorylated. 

[0008] Some of the target molecules such as phospholipase Cγ are in turn
phosphorylated and activated. Such phosphorylation transmits a signal to the cytoplasm. 
Other target molecules are not phosphorylated, but assist in signal transmission by acting 
as adapter molecules for secondary signal transducer proteins. For example, receptor 
phosphorylation and the subsequent allosteric changes in the receptor recruit the Grb- 
2/SOS complex to the catalytic domain of the receptor where its proximity to the 
membrane allows it to activate ras. 

[0009] The secondary signal transducer molecules generated by activated receptors 
result in a signal cascade that regulates cell functions such as cell division or 
differentiation. Reviews describing intracellular signal transduction include Aaronson, 
Science 254:1146-1153, 1991; Schlessinger, Trends Biochem. Sci., 13:443-47, 1988; and
Ullrich and Schlessinger, Cell, 61:203-212, 1990. 

[0010] Signal transduction pathways that regulate ion channels (e.g., potassium channels 
and calcium channels) involve G proteins which function as intermediaries between 
receptors and effectors. Gilman, Ann. Rev. Biochem., 56:615-649 (1987); Brown and
Birnbaumer, Ann. Rev. Physiol., 52:197-213 (1990). G-coupled protein receptors are
receptors for neurotransmitters, ligands that are responsible for signal production in nerve 
cells as well as for regulation of proliferation and differentiation of nerves and other cell 
types. Neurotransmitter receptors exist as different subtypes which are differentially 
expressed in various tissues and neurotransmitters such as acetylcholine evoke responses 
throughout the central and peripheral nervous systems. 

[0011] The muscarinic acetylcholine receptors play important roles in a variety of 
complex neural activities such as learning, memory, arousal and motor and sensory 
modulation. These receptors have also been implicated in several central nervous system 
disorders such as Alzheimer's disease, Parkinson's disease, depression and schizophrenia. 

[0012] Some agents that are involved in a signal transduction pathway regulating one 
ion channel, for example a potassium channel, may also be involved in one or more other 
pathways regulating one or more other ion channels, for example a calcium channel. 
Dolphin, Ann. Rev. Physiol., 52:243-55 (1990); Wilk-Blaszczak et al., Neuron, 12:109-116
(1994). Ion channels may be regulated either with or without a cytosolic second
messenger. Hille, Neuron, 9:187-195 (1992). One possible cytosolic second messenger is
a tyrosine kinase. Huang et al., Cell, 75:1145-1156 (1993), incorporated herein by
reference in its entirety, including any drawings. 

[0013] The receptors involved in the signal transduction pathways that regulate ion 
channels are ultimately linked to the ion channels by various intermediate events and 
agents. For example, such events include an increase in intracellular calcium and inositol 
triphosphate and production of endothelin. Frucht, et al., Cancer Research, 52:1114-1122
(1992); Schrey, et al., Cancer Research, 52:1786-1790 (1992). Intermediary agents 
include bombesin, which stimulates DNA synthesis and the phosphorylation of a specific 
protein kinase C substrate. Rodriguez-Pena, et al., Biochemical and Biophysical Research 
Communication, 140(1):379-385 (1986); Fisher and Schonbrunn, J. Biol. Chem.,
263(6):2208-2816 (1988).

[0014] Focal adhesion kinase (FAK) is a cytoplasmic protein tyrosine kinase localized to 
focal adhesions that is known to associate with two Src family kinases. Schaller, et al., 
Proc. Natl. Acad. Sci. U.S.A., 89:5192-5196 (1992), incorporated herein by reference in its 
entirety, including any drawings; Cobb et al., Mol. Cell. Biol., 14(1):147-155 (1994). The
proteins associated with the cytoplasmic surface of adhesion molecules are reviewed in
Gumbiner, Neuron, 11:551-564 (1993). 

[0015] FAK may regulate interactions of integrins, agonist receptors, and/or stress 
fibers. Shattil et al., J. Biol. Chem., 269(20):14738-14745 (1994); Ridley and Hall, The
EMBO Journal 13(11):2600-2610 (1994). FAK does not contain SH2 or SH3 domains
and the amino acid sequence of FAK is highly conserved among birds, rodents and man. 

[0016] In some cells the C-terminal domain of FAK is expressed autonomously as a 41 
kDa protein called FRNK and the 140 C-terminal residues of FAK contain a focal 
adhesion targeting (FAT) domain. The cDNAs encoding FRNK are given in Schaller et
al., Mol. Cell. Biol., 13(2):785-791 (1993), incorporated herein by reference in its entirety,
including any drawings. The FAT domain was identified and said to be required for
localization of FAK to cellular focal adhesions in Hildebrand et al., J. Cell Biol.,
123(4):993-1005 (1993).

[0017] The non-receptor tyrosine kinase, PYK2, is activated by binding of ligand to G- 
coupled protein receptors such as bradykinin and acetylcholine. PYK2 has a predicted 
molecular weight of 111 kD and contains five domains: (1) a relatively long N-terminal
domain; (2) a kinase catalytic domain; (3) a proline rich domain; (4) another proline rich 
domain; and (5) a C-terminal focal adhesion targeting (FAT) domain. PYK2 does not 
contain a SH2 or SH3 domain. 

[0018] The FAT domain of PYK2 has 62% similarity to the FAT domain of another 
non-receptor tyrosine kinase, FAK, which is also activated by G-coupled proteins. The 
overall similarity between PYK2 and FAK is 52%. PYK2 is expressed principally in 
neural tissues, although expression can also be detected in hematopoietic cells at early 
stages of development and in some tumor cell lines. The expression of PYK2 does not
correspond with the expression of FAK. 

[0019] PYK2 is also known as Cell Adhesion Kinase β (CAK β) and Related Adhesion
Focal Tyrosine Kinase (RAFTK). Nucleotide and amino acid sequences for PYK2 are 
described in a set of related patents, including U.S. Patent 5,837,815; 5,837,524; and
Patent Publication U.S. 2002/0048782, which also provided additional information on 
PYK2 and a related protein, FAK, including some of the information described below. 
Each of these documents describes nucleotide and amino acid sequences for PYK2.
Patent 5,837,524 describes a method of screening for agents "able to promote or disrupt 
the interaction" between "a PYK2 polypeptide and a natural binding partner (NBP)." (Col. 
8, lines 60-67.) Patent Publication U.S. 2002/0048782 provides examples describing 
cloning and the testing of certain properties of PYK2. Each of these patents and the patent
publication is incorporated by reference herein in its entirety, including drawings.

[0020] PYK2 is believed to regulate the activity of potassium channels in response to 
neurotransmitter signalling. PYK2 enzymatic activity is positively regulated by 
phosphorylation on tyrosine and results in response to binding of bradykinin, TPA, 
calcium ionophore, carbachol, TPA + forskolin, and membrane depolarization. The
combination of toxins known to positively regulate G-coupled receptor signalling (such as
pertussis toxin, cholera toxins, TPA and bradykinin) increases the phosphorylation of
PYK2. Activated PYK2 phosphorylates RAK, a delayed rectifier type potassium channel, 
and thus suppresses RAK activity. In the same system, FAK does not phosphorylate 
RAK. 

[0021] Further, integrin-linked signaling is important for regulating cell adhesion and 
motility. (Hynes, R. (2002) Integrins: bidirectional, allosteric signaling machines. Cell, 
110, 673-687.) The FAK and PYK2 tyrosine kinases are key mediators of integrin- 
dependent signals. (Hauck et al. (2000) Focal adhesion kinase functions as a receptor-
proximal signaling component required for directed cell migration. Immunol Res, 21, 293- 
303.) Both FAK and PYK2 mediate cytoskeletal rearrangements as a consequence of 
integrin ligation. FAK, which localizes to focal adhesions, is activated by binding of cell- 
surface integrins to the extracellular matrix. In response to external stimuli, growth factors 
associate with integrins, and FAK also becomes phosphorylated in response to growth 
factors. (Sieg, et al. (2000) FAK integrates growth-factor and integrin signals to promote 
cell migration. Nat Cell Biol, 2, 249-256.) In addition to its role in regulating the 
cytoskeleton and cell movements, FAK also helps to coordinate these processes with 
growth signals and cellular survival. 

[0022] By contrast, PYK2 is localized to the sites of cell-cell contacts, and becomes 
activated in response to calcium mobilization. (Lev, et al. (1995) Protein tyrosine kinase 
PYK2 involved in Ca(2+)-induced regulation of ion channel and MAP kinase functions. 
Nature, 376, 737-745.) Indeed, whereas FAK appears to mediate cellular survival, PYK2
activation leads to apoptosis in fibroblasts. (Xiong, W. and Parsons, J.T. (1997) Induction 
of apoptosis after expression of PYK2, a tyrosine kinase structurally related to focal 
adhesion kinase. J Cell Biol, 139, 529-539.) In monocytes and osteoclasts, PYK2 localizes 
to the podosome, a cellular protrusion that contacts the extracellular matrix and mediates 
adhesion and motility in these cell types. (Duong et al. (1998) PYK2 in osteoclasts is an
adhesion kinase, localized in the sealing zone, activated by ligation of alpha(v)beta3
integrin, and phosphorylated by src kinase. J Clin Invest, 102, 881-892; Lakkakorpi et al.
(1999) Stable association of PYK2 and p130(Cas) in osteoclasts and their co-localization
in the sealing zone. J Biol Chem, 274, 4900-4907.) 

[0023] In spite of the different biological functions, FAK and PYK2 are the only
members of the FAK family of tyrosine kinases, and they share 45% sequence identity 
overall, with higher homology in the kinase catalytic domain (60%). (Lev et al. (1995) 
Nature, 376, 737-745; Sasaki et al. (1995) Cloning and characterization of cell adhesion 
kinase beta, a novel protein-tyrosine kinase of the focal adhesion kinase subfamily. J Biol
Chem, 270, 21206-21219.) Furthermore, most of the key regulatory sites are highly 
conserved. In the N-terminus is a large integrin-binding domain. In the C-terminus is the 
so-called FAT (focal adhesion targeting) domain that mediates subcellular localization via 
binding sites for the cytoskeleton-associated proteins paxillin and talin. The kinase 
catalytic domain is in the center of the proteins. In addition, proline-rich regions in the C- 
terminus serve to bind to the SH3 domains of the adaptor proteins CAS and GRAF. 
(Hildebrand et al. (1996) An SH3 domain-containing GTPase-activating protein for Rho 
and Cdc42 associates with focal adhesion kinase. Mol Cell Biol, 16, 3169-3178; Polte, 
T.R. and Hanks, S.K. (1995) Interaction between focal adhesion kinase and Crk-associated 
tyrosine kinase substrate p130Cas. Proc Natl Acad Sci USA, 92, 10678-10682.)

[0024] The primary autophosphorylation site (Y397 in FAK, Y402 in PYK2, just 
upstream of the catalytic domain) serves as a binding site for the SH2 domain of a Src- 
family tyrosine kinase. (Dikic et al. (1996) A role for Pyk2 and Src in linking G-protein- 
coupled receptors with MAP kinase activation. Nature, 383, 547-550.) This site is also a 
substrate for the Src kinase. Additional tyrosine phosphorylation events occur at residues 
within the catalytic domain (Y576, Y577 in FAK, Y579, Y580 in PYK2) whose function 
is unclear, and at a C-terminal site (Y925 in FAK, Y881 in PYK2) that serves as binding 
site for the SH2 domain of GRB2. (Schlaepfer et al. (1999) Signaling through focal
adhesion kinase. Prog Biophys Mol Biol, 71, 435-478.) In addition to assembling a variety 
of proteins, FAK and PYK2 also play important roles by phosphorylating key substrates 
such as paxillin and CAS. (Bellis et al. (1995) Characterization of tyrosine 
phosphorylation of paxillin in vitro by focal adhesion kinase. J Biol Chem, 270, 17437- 
17441; Li, X. and Earp, H.S. (1997) Paxillin is tyrosine-phosphorylated by and 
preferentially associates with the calcium-dependent tyrosine kinase in rat liver epithelial 
cells. J Biol Chem, 272, 14341-14348.) Tyrosine phosphorylation of paxillin and CAS 
creates a new binding site for SH2 adaptor proteins. For example, paxillin binds to and is 
phosphorylated by PYK2 in hematopoietic cells. (McShan et al. (2002) Csk homologous 
kinase associates with RAFTK/Pyk2 in breast cancer cells and negatively regulates its 
activation and breast cancer cell migration. Internat. J. Oncology 21:197-205.) 

[0025] Furthermore, expression of PYK2 and FAK was observed in breast cancer cells, 
and it was reported that PYK2 participates in intracellular signaling upon heregulin (HRG) 
stimulation and promotes breast carcinoma invasion. CHK acted as a negative regulator 
of PYK2, significantly reducing the migration of PYK2 expressing breast cancer cells. 
(McShan et al. (2002) Internat. J. Oncology 21:197-205.)

[0026] Methods of identifying a compound that binds to and/or modulates the activity of 
PYK2 are described in Duong et al., PCT/US98/02797, WO 98/35056, where the method 
involves contacting the compound and PYK2 and determining if binding has occurred. If 
binding has occurred, the activity of the bound PYK2 can be compared to the activity of 
PYK2 which is not bound to the compound to determine if the compound modulates
PYK2 activity (p. 2, lines 9-15). The compounds identified are indicated to be useful in
the prevention or treatment of osteoporosis, inflammation, and other conditions dependent
on monocyte migration and invasion activities (p. 3, lines 1-5). This application is hereby
incorporated by reference in its entirety. 

SUMMARY OF THE INVENTION 

[0027] The present invention concerns structural information about PYK2 kinase, 
crystals of PYK2 kinases with and without binding compounds, and the use of the PYK2 
kinase crystals and structural information about the PYK2 kinase to develop PYK2
ligands, e.g., inhibitors. 

[0028] Thus, in a first aspect, the invention concerns a method for determining the 
orientation of compounds that bind to PYK2 and/or identifying binding compounds by 
determining the orientation of at least one compound bound to PYK2 in co-crystals of 
PYK2 with binding compound. The method also characterizes the binding of a PYK2 
binding compound bound to PYK2. In particular embodiments, the method can also 
involve one or more of: identifying as molecular scaffolds one or more compounds that 
bind weakly (with low or very low affinity) to a binding site of PYK2 kinase and have 
molecular weight less than 350 daltons; determining activity of the compounds or 
molecular scaffolds against PYK2 (activity can also be determined against 1, 2, 3, or more 
additional kinases; scaffolds preferably have low activity); determining the orientation of 
at least one molecular scaffold in co-crystals with PYK2 kinase; identifying chemical 
structures of one or more of the molecular scaffolds that, when modified, alter the binding 
affinity or binding specificity or both between the molecular scaffold and the PYK2 
kinase; synthesizing or otherwise obtaining a ligand in which one or more of the chemical 
structures of the molecular scaffold is modified to provide a ligand that binds to the PYK2 
kinase with altered binding affinity or binding specificity or both. Thus, the invention 
provides a method for identifying or developing PYK2 ligands, e.g., by identifying 
derivatives of PYK2 binding compounds, which may be molecular scaffolds, that have 
greater affinity and/or greater specificity for PYK2 than the parent compound. For 
example, the method can involve determining the binding orientation, identifying one or 
more chemical structures of one or more compounds that, when modified, alter the binding 
affinity and/or specificity; and synthesizing or otherwise obtaining a ligand in which one 
or more of those chemical structures is modified to provide a ligand that binds to PYK2 
kinase with altered binding affinity or binding specificity or both. The method can also 
include identifying a molecular scaffold that binds to PYK2. Highly preferably the 
modified compound (ligand) also has altered activity (i.e., altered effect on the activity of 
PYK2 kinase). 



[0029] The terms "PYK2 kinase" and "PYK2" mean an enzymatically active kinase that 
contains a portion at least 50 amino acid residues in length with greater than 90% amino 
acid sequence identity to at least a portion of PYK2 kinase domain (SEQ ID NO.: 1), for a 
maximal alignment over an equal length segment; or that contains a portion with greater
than 90% amino acid sequence identity to SEQ ID NO.: 1 that retains binding to ATP.
Preferably the sequence identity is at least 95, 97, 98, 99, or even 100% with SEQ ID NO:
1. Preferably the identity is over a portion of SEQ ID NO: 1 that is at least 100, 150, 200,
250, or 272 amino acids in length.
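
The identity criterion in this definition can be illustrated with a minimal sketch. It assumes an ungapped comparison of equal-length segments, which is a simplification; an actual determination of "maximal alignment" would normally use a gapped alignment tool. SEQ ID NO: 1 is not reproduced in this section, so the function names, the default 50-residue window, and any sequences passed in are illustrative assumptions only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions carrying identical residues in two equal-length segments."""
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(candidate: str, reference: str, window: int = 50) -> float:
    """Highest ungapped identity between any window-length segment of each sequence.

    Assumption: every segment of `candidate` is compared against every segment
    of `reference`; gaps are not considered.
    """
    if len(candidate) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(candidate) - window + 1):
        segment = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(segment, reference[j:j + window]))
    return best


def meets_identity_criterion(candidate: str, reference: str,
                             window: int = 50, threshold: float = 90.0) -> bool:
    """True if some window-length portion exceeds the threshold ("greater than 90%")."""
    return best_window_identity(candidate, reference, window) > threshold
```

Raising `threshold` to 95, 97, 98, or 99, or `window` to 100, 150, 200, 250, or 272, corresponds to the preferred ranges recited above.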

[0030] The term "PYK2 kinase domain" refers to a reduced length PYK2 (i.e., shorter
than a full-length PYK2 by at least 100 amino acids at each of the N-terminus and the C- 
terminus) that includes the kinase catalytic region in PYK2, which is located near the 
center of the full-length molecule. Highly preferably for use in this invention, the kinase 
domain retains kinase activity, preferably at least 50% the level of kinase activity as 
compared to the native PYK2, more preferably at least 60, 70, 80, 90, or 100% of the 
native activity in a competitive kinase assay with ATP as a substrate and ATPγS as
competitive inhibitor. An example is the PYK2 kinase domain of SEQ ID NO: 1.

[0031] As used herein, the terms "ligand" and "modulator" are used equivalently to 
refer to a compound that modulates the activity of a target biomolecule, e.g., an enzyme 
such as a kinase. Generally a ligand or modulator will be a small molecule, where "small 
molecule" refers to a compound with a molecular weight of 1500 daltons or less, or
preferably 1000 daltons or less, 800 daltons or less, or 600 daltons or less. Thus, an 
"improved ligand" is one that possesses better pharmacological and/or pharmacokinetic 
properties than a reference compound, where "better" can be defined by a person for a 
particular biological system or therapeutic use. In terms of the development of ligands 
from scaffolds, a ligand is a derivative of a scaffold. 

[0032] In the context of binding compounds, molecular scaffolds, and ligands, the term 
"derivative" or "derivative compound" refers to a compound having a chemical structure 
that contains a common core chemical structure as a parent or reference compound, but 
differs by having at least one structural difference, e.g., by having one or more substituents 
added and/or removed and/or substituted, and/or by having one or more atoms substituted 
with different atoms. Unless clearly indicated to the contrary, the term "derivative" does 
not mean that the derivative is synthesized using the parent compound as a starting 
material or as an intermediate, although in some cases, the derivative may be synthesized 
from the parent. 

[0033] Thus, the term "parent compound" refers to a reference compound for another
compound, having structural features continued in the derivative compound. Often but not 
always, a parent compound has a simpler chemical structure than the derivative. 

[0034] By "chemical structure" or "chemical substructure" is meant any definable atom 
or group of atoms that constitute a part of a molecule. Normally, chemical substructures 
of a scaffold or ligand can have a role in binding of the scaffold or ligand to a target 
molecule, or can influence the three-dimensional shape, electrostatic charge, and/or 
conformational properties of the scaffold or ligand. 

[0035] The term "binds" in connection with the interaction between a target and a 
potential binding compound indicates that the potential binding compound preferentially 
associates with the target to a statistically significant degree as compared to association 
with proteins generally (i.e., non-specific binding). Thus, the term "binding compound" 
refers to a compound that has such a statistically significant association with a target 
molecule. Preferably a binding compound interacts with a specified target with a 
dissociation constant (Kd) of 1 mM or less. A binding compound can bind with "low
affinity", "very low affinity", "extremely low affinity", "moderate affinity", "moderately 
high affinity", or "high affinity" as described herein. 

[0036] In the context of compounds binding to a target, the term "greater affinity" 
indicates that the compound binds more tightly than a reference compound, or than the 
same compound in a reference condition, i.e., with a lower dissociation constant. In 
particular embodiments, the greater affinity is at least 2, 3, 4, 5, 8, 10, 50, 100, 200, 400, 
500, 1000, or 10,000-fold greater affinity. 
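
As a hedged illustration of the two preceding definitions, the sketch below expresses "greater affinity" as the ratio of dissociation constants and reports the largest recited fold value that a compound reaches. All constants are taken in micromolar; the function names and the choice of units are assumptions made for this example.

```python
# Fold values recited for "greater affinity" in particular embodiments.
FOLD_TIERS = (2, 3, 4, 5, 8, 10, 50, 100, 200, 400, 500, 1000, 10_000)

# Preferred upper limit on Kd for a binding compound: 1 mM, expressed in micromolar.
BINDING_KD_MAX_UM = 1000.0


def is_binding_compound(kd_um: float) -> bool:
    """True if the dissociation constant is 1 mM or less."""
    return 0 < kd_um <= BINDING_KD_MAX_UM


def fold_affinity_gain(kd_reference_um: float, kd_compound_um: float) -> float:
    """Ratio of reference Kd to compound Kd; values above 1 mean tighter binding."""
    if kd_reference_um <= 0 or kd_compound_um <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_reference_um / kd_compound_um


def highest_tier_met(fold: float):
    """Largest recited fold value that `fold` meets or exceeds, or None if below 2-fold."""
    met = [tier for tier in FOLD_TIERS if fold >= tier]
    return met[-1] if met else None
```

For instance, a derivative with a Kd of 2 micromolar compared with a parent at 200 micromolar gives a 100-fold gain under this ratio convention.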

[0037] Also in the context of compounds binding to a biomolecular target, the term 
"greater specificity" indicates that a compound binds to a specified target to a greater 
extent than to another biomolecule or biomolecules that may be present under relevant 
binding conditions, where binding to such other biomolecules produces a different 
biological activity than binding to the specified target. Typically, the specificity is with 
reference to a limited set of other biomolecules, e.g., in the case of PYK2, other kinases or 
even other types of enzymes. In particular embodiments, the greater specificity is at least
2, 3, 4, 5, 8, 10, 50, 100, 200, 400, 500, or 1000-fold greater specificity. 

[0038] As used in connection with binding of a compound with PYK2, the term
"interact" indicates that the distance from a bound compound to a particular amino acid 
residue will be 5.0 angstroms or less, or 6 angstroms or less with one water molecule 
coordinated between the compound and the residue, or 9 angstroms or less with two water 
molecules coordinated between the compound and the residue. In particular embodiments, 
the distance from the compound to the particular amino acid residue is 4.5 angstroms or 
less, 4.0 angstroms or less, or 3.5 angstroms or less. Such distances can be determined, for 
example, using co-crystallography, or estimated using computer fitting of a compound in a 
PYK2 active site. 
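
The distance limits in this definition can be sketched as a small check. The text does not state how the compound-to-residue distance is measured; the closest atom pair is used here as an assumption, and coordinates are taken to be Cartesian positions in angstroms, as would be read from a co-crystal structure or a docking model. Function names are hypothetical.

```python
import math

# Maximum compound-to-residue distance (angstroms), keyed by the number of
# water molecules coordinated between the compound and the residue.
MAX_DISTANCE_BY_WATERS = {0: 5.0, 1: 6.0, 2: 9.0}


def min_distance(atoms_a, atoms_b) -> float:
    """Smallest distance between any atom in atoms_a and any atom in atoms_b.

    Assumption: distance is measured between the closest atom pair.
    """
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def interacts(distance: float, bridging_waters: int = 0) -> bool:
    """True if the distance satisfies the limit for the given number of bridging waters."""
    limit = MAX_DISTANCE_BY_WATERS.get(bridging_waters)
    if limit is None:
        # The definition covers only zero, one, or two coordinated waters.
        return False
    return distance <= limit
```

The tighter limits of particular embodiments (4.5, 4.0, or 3.5 angstroms) could be applied by comparing the returned distance against those values directly.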

[0039] Reference to particular amino acid residues in PYK2 polypeptide by residue number
is defined by the numbering provided in Lev et al. (1995) "Protein tyrosine kinase PYK2
involved in Ca(2+)-induced regulation of ion channel and MAP kinase functions" Nature 
376:737-745. 

[0040] In a related aspect, the invention provides a method for developing ligands 
specific for PYK2 kinase, where the method involves determining whether a derivative of 
a compound that binds to a plurality of kinases has greater specificity for the PYK2 kinase
than the parent compound with respect to other kinases. In particular embodiments, the 
method also involves identifying such a compound that binds to a plurality of kinases. 

[0041] As used herein in connection with binding compounds or ligands, the term 
"specific for PYK2 kinase", "specific for PYK2" and terms of like import mean that a 
particular compound binds to the particular PYK2 kinase to a statistically greater extent 
than to other kinases that may be present in a particular organism. Also, where biological 
activity other than binding is indicated, the term "specific for a PYK2 kinase" indicates 
that a particular compound has greater biological activity associated with binding PYK2 
than to other kinases. Preferably, the specificity is also with respect to other biomolecules 
(not limited to kinases) that may be present from an organism. 

[0042] In another aspect, the invention provides a method for obtaining improved 
ligands binding to PYK2, where the method involves identifying a compound that binds to 
PYK2, determining whether that compound interacts with one or more of PYK2 residues 
503, 505, 457, 488, 567, and 554, and determining whether a derivative of that compound 
binds to the PYK2 kinase with greater affinity or greater specificity or both than the parent 
binding compound. Binding with greater affinity or greater specificity or both than the
parent compound indicates that the derivative is an improved ligand. This process can 
also be carried out in successive rounds of selection and derivatization and/or with 
multiple parent compounds to provide a compound or compounds with improved ligand 
characteristics. Likewise, the derivative compounds can be tested and selected to give 
high selectivity for the PYK2 kinase, or to give cross-reactivity to a particular set of 
targets, for example to a subset of kinases that includes PYK2. Certain compounds
interact with the specified residues as follows: residues 503 and 505 directly; residues 457,
488, and 567 through one water molecule; and residue 554 through two water molecules.
In particular embodiments, a molecular scaffold, binding compound, or ligand interacts
with at least residues 503 and 505; with residues 503 and 505 and at least one of residues
457, 488, and 567; or with at least residues 503, 505, 457, 488, and 567.
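
The three residue combinations recited for particular embodiments can be expressed as a simple set check. This is a sketch only: the numeric levels 0 to 3 are labels invented for this example, and the input is assumed to be the set of PYK2 residue numbers already found to interact with the compound.

```python
DIRECT = frozenset({503, 505})          # residues contacted directly
ONE_WATER = frozenset({457, 488, 567})  # residues contacted through one water
TWO_WATERS = frozenset({554})           # residue contacted through two waters


def embodiment_level(contacted) -> int:
    """Most specific recited residue combination satisfied by the contacted residues.

    0: residues 503 and 505 are not both contacted
    1: at least residues 503 and 505
    2: residues 503 and 505 plus at least one of 457, 488, 567
    3: at least residues 503, 505, 457, 488, and 567
    """
    contacted = set(contacted)
    if not DIRECT <= contacted:
        return 0
    if ONE_WATER <= contacted:
        return 3
    if ONE_WATER & contacted:
        return 2
    return 1
```

Residue 554 is listed among the residues of interest but is not part of the three recited combinations, so it does not affect the level returned here.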

[0043] By "molecular scaffold" or "scaffold" is meant a simple target binding molecule 
to which one or more additional chemical moieties can be covalently attached, modified, 
or eliminated to form a plurality of molecules with common structural elements. The 
moieties can include, but are not limited to, a halogen atom, a hydroxyl group, a methyl 
group, a nitro group, a carboxyl group, or any other type of molecular group including, but 
not limited to, those recited in this application. Molecular scaffolds bind to at least one 
target molecule, preferably to a plurality of molecules in a target family, e.g., a protein 
family. Preferred target molecules include enzymes and receptors, as well as other 
proteins. Preferred characteristics of a scaffold can include binding at a target molecule 
binding site such that one or more substituents on the scaffold are situated in binding 
pockets in the target molecule binding site; having chemically tractable structures that can 
be chemically modified, particularly by synthetic reactions, e.g., so that a combinatorial 
library can be easily constructed; having chemical positions where moieties can be 
attached that do not interfere with binding of the scaffold to a protein binding site, such 
that the scaffold or library members can be modified to form ligands, to achieve additional 
desirable characteristics, e.g., enabling the ligand to be actively transported into cells 
and/or to specific organs, or enabling the ligand to be attached to a chromatography 
column for additional analysis. Thus, a molecular scaffold is an identified target binding 
molecule prior to modification to improve binding affinity and/or specificity, or other 
pharmacologic properties.

[0044] The term "scaffold core" refers to the core structure of a molecular scaffold onto
which various substituents can be attached. Thus, for a number of scaffold molecules of a 
particular chemical class, the scaffold core is common to all the scaffold molecules. In 
many cases, the scaffold core will consist of or include one or more ring structures. 

[0045] By "binding site" is meant an area of a target molecule to which a ligand can 
bind non-covalently. Binding sites embody particular shapes and often contain multiple 
binding pockets present within the binding site. The particular shapes are often conserved 
within a class of molecules, such as a protein family. Binding sites within a class also can 
contain conserved structures such as, for example, chemical moieties, the presence of a 
binding pocket, and/or an electrostatic charge at the binding site or some portion of the 
binding site, all of which can influence the shape of the binding site. 

[0046] By "binding pocket" is meant a specific volume within a binding site. A binding 
pocket can often be a particular shape, indentation, or cavity in the binding site. Binding 
pockets can contain particular chemical groups or structures that are important in the non- 
covalent binding of another molecule such as, for example, groups that contribute to ionic, 
hydrogen bonding, or van der Waals interactions between the molecules. 

[0047] By "orientation", in reference to a binding compound bound to a target molecule,
is meant the spatial relationship of the binding compound (which can be defined by 
reference to at least some of its constituent atoms) to the binding site and/or atoms of the 
target molecule at least partially defining the binding site, typically including one or more 
binding pockets and/or atoms defining one or more binding pockets. 

[0048] In the context of target molecules in this invention, the term "crystal" refers to a 
regular assemblage of a target molecule of a type suitable for X-ray crystallography. That 
is, the assemblage produces an X-ray diffraction pattern when illuminated with a beam of 
X-rays. Thus, a crystal is distinguished from an agglomeration or other complex of target 
molecule that does not give a diffraction pattern. 

[0049] By "co-crystal" is meant a complex of the compound, molecular scaffold, or 
ligand bound non-covalently to the target molecule and present in a crystal form 
appropriate for analysis by X-ray or protein crystallography. In preferred embodiments 
the target molecule-ligand complex can be a protein-ligand complex.

[0050] The phrase "alter the binding affinity or binding specificity" refers to changing
the binding constant of a first compound for another, and/or changing the level of binding 
of a first compound for a second compound as compared to the level of binding of the first 
compound for third compounds, respectively. For example, the binding specificity of a 
compound for a particular protein is increased if the relative level of binding to that 
particular protein is increased as compared to binding of the compound to unrelated 
proteins. 

[0051] As used herein in connection with test compounds, binding compounds, and 
modulators (ligands), the term "synthesizing" and like terms mean chemical synthesis
from one or more precursor materials. 

[0052] The phrase "chemical structure of the molecular scaffold is modified" means that 
a derivative molecule has a chemical structure that differs from that of the molecular 
scaffold but still contains common core chemical structural features. The phrase does not 
necessarily mean that the molecular scaffold is used as a precursor in the synthesis of the 
derivative. 

[0053] By "assaying" is meant the creation of experimental conditions and the gathering 
of data regarding a particular result of the experimental conditions. For example, enzymes 
can be assayed based on their ability to act upon a detectable substrate. A compound or 
ligand can be assayed, for example, based on its ability to bind to a particular target 
molecule or molecules. 

[0054] Certain compounds have been identified as molecular scaffolds and binding 
compounds for PYK2. Thus, in another aspect, the invention provides a method for 
identifying a ligand binding to PYK2, that includes determining whether a derivative 
compound that includes a core structure of Formula I as described herein binds to PYK2 
with altered binding affinity or specificity or both as compared to a parent compound. 

[0055] In reference to compounds of Formula I, the term "core structure" refers to the 
ring structure shown diagrammatically as part of the description of compounds of Formula I,
but excluding substituents. More generally, the term "core structure" refers to a
characteristic chemical structure common to a set of compounds, especially a chemical
structure that carries variable substituents in the compound set.

[0056] By a "set" of compounds is meant a collection of compounds. The compounds
may or may not be structurally related. 

[0057] In another aspect, structural information about PYK2 can also be used to assist in 
determining a structure for another kinase, e.g., FAK, by creating a homology model from
an electronic representation of a PYK2 structure. 

[0058] Typically creating such a homology model involves identifying conserved amino 
acid residues between PYK2 and the other kinase of interest; transferring the atomic
coordinates of a plurality of conserved amino acids in the PYK2 structure to the
corresponding amino acids of the other kinase to provide a rough structure of that kinase; 
and constructing structures representing the remainder of the other kinase using electronic 
representations of the structures of the remaining amino acid residues in the other kinase. 
In particular, coordinates from Table 1 or Table 2 for conserved residues can be used. 
Conserved residues in a binding site, e.g., PYK2 residues 503, 505, 457, 488, 567, and 
554, can be used. 
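
The coordinate-transfer step described above can be sketched in a few lines. This is a minimal illustration only, assuming a pre-computed pairwise alignment (gaps written as "-") and one representative coordinate per residue; the function name, the argument layout, and the use of exact identity as the test for a conserved residue are assumptions made for the example and are not taken from Table 1 or Table 2.

```python
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def transfer_conserved_coordinates(
    template_aligned: str,
    target_aligned: str,
    template_start: int,
    template_coords: Dict[int, Coord],
) -> Dict[int, Coord]:
    """Copy template coordinates onto identical aligned residues of the target.

    template_aligned / target_aligned: aligned sequences of equal length,
        with '-' marking gaps (hypothetical input format).
    template_start: residue number of the first template residue.
    template_coords: template residue number -> representative coordinate.
    Returns: 1-based position in the ungapped target -> copied coordinate.
    """
    if len(template_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have equal length")

    rough_model: Dict[int, Coord] = {}
    template_num = template_start - 1  # residue number in the template
    target_pos = 0                     # position in the ungapped target

    for t_res, q_res in zip(template_aligned, target_aligned):
        if t_res != "-":
            template_num += 1
        if q_res != "-":
            target_pos += 1
        # Transfer only where both residues exist and are identical.
        if t_res != "-" and t_res == q_res and template_num in template_coords:
            rough_model[target_pos] = template_coords[template_num]
    return rough_model
```

The residues left without coordinates after this step are the ones that would then be built by the remaining modeling steps.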

[0059] To assist in developing other portions of the kinase structure, the homology 
model can also utilize, or be fitted with, low resolution X-ray diffraction data from one or 
more crystals of the kinase, e.g., to assist in linking conserved residues and/or to better 
specify coordinates for terminal portions of a polypeptide. 

[0060] The PYK2 structural information used can be for a variety of different PYK2 
variants, including full-length wild type, naturally-occurring variants (e.g., allelic variants 
and splice variants), truncated variants of wild type or naturally-occurring variants, and
mutants of full-length or truncated wild-type or naturally-occurring variants (that can be 
mutated at one or more sites). For example, in order to provide a PYK2 structure closer to 
a variety of other kinase structures, a mutated PYK2 that includes a mutation to a 
conserved residue in a binding site can be used (or a plurality of such mutations). 

[0061] In another aspect, the invention provides a crystalline form of PYK2, which may 
be a reduced length PYK2 such as a PYK2 kinase domain, e.g., having atomic coordinates 
as described in Table 1 or Table 2. The crystalline form can contain one or more heavy 
metal atoms, for example, atoms useful for X-ray crystallography. The crystalline form 
can also include a binding compound in a co-crystal, e.g., a binding compound that 
interacts with one or more of PYK2 residues 503, 505, 457, 488, 567, and 554
or any two, any three, any four, any five, or all six of those residues, and can, for example, 
be a compound of Formula I. PYK2 crystals can be in various environments, e.g., in a 
crystallography plate, mounted for X-ray crystallography, and/or in an X-ray beam. The 
PYK2 may be of various forms, e.g., a wild-type, variant, truncated, and/or mutated form 
as described herein. 

[0062] The invention further concerns co-crystals of PYK2, which may be a reduced length
PYK2, e.g., a PYK2 kinase domain, and a PYK2 binding compound. Advantageously,
such co-crystals are of sufficient size and quality to allow structural determination of 
PYK2 to at least 3 Angstroms, 2.5 Angstroms, 2.0 Angstroms, or 1.8 Angstroms. The co- 
crystals can, for example, be in a crystallography plate, be mounted for X-ray 
crystallography and/or in an X-ray beam. Such co-crystals are beneficial, for example, for 
obtaining structural information concerning interaction between PYK2 and binding 
compounds. 

[0063] PYK2 binding compounds can include compounds that interact with at least one 
of PYK2 residues 503, 505, 457, 488, 567, and 554, or any 2, 3, 4, 5, or all 6 of those 
residues. Exemplary compounds that bind to PYK2 include compounds of Formula I. 

[0064] Likewise, in additional aspects, methods for obtaining PYK2 crystals and co- 
crystals are provided. In one aspect is provided a method for obtaining a crystal of PYK2 
kinase domain, by subjecting PYK2 kinase domain protein at 5-20 mg/ml, preferably 8-12 
mg/ml, to crystallization conditions as described below, or conditions substantially
equivalent thereto: 

2-10 % (e.g., 8%) polyethylene glycol (PEG) 8000, 0.2 M sodium acetate, 0.1% 
sodium cacodylate pH 6.5, 20% glycerol. 
In general, the PYK2 will be in a solution containing the protein and suitable buffer. For 
example, the solution can contain 20 mM Tris-HCl pH 8.0, 150 mM NaCl, 14 mM
β-mercaptoethanol (BME), and 1 mM dithiothreitol (DTT).

[0065] Crystallization conditions can be initially identified using a screening kit, such as 
a Hampton Research (Riverside, CA) screening kit 1 and/or 2. Conditions resulting in 
crystals can be selected and crystallization conditions optimized based on the 
demonstrated crystallization conditions. To assist in subsequent crystallography, the 
PYK2 can be seleno-methionine labeled. Also, as indicated above, the PYK2 may be any
of various forms, e.g., truncated to provide a PYK2 kinase domain, which can be selected 
to be of various lengths. 

[0066] In connection with chemical concentrations, the terms "approximately" and 
"about" mean ±20% of the indicated value. 

[0067] In the context of crystallization conditions, the term "substantially equivalent" 
means conditions in a range around identified crystallization conditions such that the 
concentrations of solution components are within ±10% of the stated value and pH is within ±1 pH
unit, preferably ±0.5 pH unit. Polymer, salt, and buffer substitutions may be made so long
as one of ordinary skill in the art of protein crystallization would recognize the solution 
with the substituted component as being likely to also result in crystallization (though re- 
optimization may be useful). An example of such a substitution can be the substitution of 
a particular size PEG with a slightly smaller or larger PEG product, or a mixture of both a 
larger and a smaller PEG product. 
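
The numeric part of the "substantially equivalent" definition can be expressed as a simple check. The sketch below is illustrative only: it assumes conditions are recorded as a dictionary of component values plus a "pH" entry, uses the 8% PEG 8000 example value from the conditions listed above, and does not attempt to model component substitutions, which the definition leaves to the judgment of one of ordinary skill. The dictionary keys and function names are hypothetical.

```python
# Stated conditions as given in the text; 8% is the example value within
# the 2-10% PEG 8000 range. Keys are hypothetical labels for this sketch.
STATED_CONDITIONS = {
    "PEG 8000 (%)": 8.0,
    "sodium acetate (M)": 0.2,
    "sodium cacodylate (%)": 0.1,
    "glycerol (%)": 20.0,
    "pH": 6.5,
}


def within_fraction(value: float, stated: float, fraction: float) -> bool:
    """True if value lies within +/- fraction of the stated value."""
    # A tiny slack absorbs floating-point rounding at the boundary.
    return abs(value - stated) <= fraction * abs(stated) + 1e-12


def substantially_equivalent(stated: dict, candidate: dict,
                             ph_tolerance: float = 1.0) -> bool:
    """Check concentrations within +/-10% and pH within +/-ph_tolerance.

    Use ph_tolerance=0.5 for the preferred, narrower pH window.
    """
    if set(stated) != set(candidate):
        return False  # substitutions are outside the scope of this check
    for name, reference in stated.items():
        if name == "pH":
            if abs(candidate[name] - reference) > ph_tolerance:
                return False
        elif not within_fraction(candidate[name], reference, 0.10):
            return False
    return True
```

For example, a candidate with 8.5% PEG 8000 at pH 7.2 passes the default check but fails when the preferred ±0.5 pH window is requested.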

[0068] A related aspect provides a method for obtaining co-crystals of PYK2, which can 
be a reduced length PYK2, with a binding compound, by subjecting PYK2 protein at 5-20 
mg/ml to crystallization conditions substantially equivalent to the conditions as described 
above, in the presence of binding compound, for a time sufficient for crystal development.
The binding compound may be added at various concentrations depending on the nature of
the compound, e.g., final concentration of 0.5 to 1.0 mM. In many cases, the binding
compound will be in solution in an organic solvent such as dimethyl sulfoxide (DMSO).
While not preferred, binding compound can also be soaked into a PYK2 crystal, e.g., using 
conventional techniques. 
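
As a worked illustration of the final-concentration figures above, the volume of compound stock to add follows from the usual dilution relation (stock concentration × stock volume = final concentration × final volume). The stock concentration and drop volume used below are hypothetical values chosen for the example; only the 0.5 to 1.0 mM final range comes from the text.

```python
def stock_volume_for_final(final_mM: float, final_volume_uL: float,
                           stock_mM: float) -> float:
    """Volume of stock (in microliters) giving final_mM in final_volume_uL."""
    if stock_mM <= 0 or final_volume_uL <= 0 or final_mM < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_mM > stock_mM:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_mM * final_volume_uL / stock_mM


def solvent_fraction(stock_volume_uL: float, final_volume_uL: float) -> float:
    """Fraction of the final volume contributed by the stock solvent."""
    return stock_volume_uL / final_volume_uL


# Hypothetical example: 100 mM stock in DMSO, 100 uL final volume.
volume = stock_volume_for_final(1.0, 100.0, 100.0)
print(volume, solvent_fraction(volume, 100.0))
```

Tracking the solvent fraction alongside the compound volume is useful because the organic solvent itself changes the composition of the crystallization solution.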

[0069] In another aspect, provision of compounds active on PYK2 also provides a 
method for modulating PYK2 activity by contacting PYK2 with a compound that binds to 
PYK2 and interacts with one or more of residues 503, 505, 457, 488, 567, and 554,
for example a compound of Formula I. The compound is preferably provided at a level 
sufficient to modulate the activity of PYK2 by at least 10%, more preferably at least 20%, 
30%, 40%, or 50%. In many embodiments, the compound will be at a concentration of 
about 1 µM, 100 µM, or 1 mM, or in a range of 1-100 nM, 100-500 nM, 500-1000 nM,
1-100 µM, 100-500 µM, or 500-1000 µM.

[0070] As used herein, the term "modulating" or "modulate" refers to an effect of
altering a biological activity, especially a biological activity associated with a particular 
biomolecule such as PYK2. For example, an agonist or antagonist of a particular 
biomolecule modulates the activity of that biomolecule, e.g., an enzyme. 

[0071] The term "PYK2 activity" refers to a biological activity of PYK2, particularly 
including kinase activity. 

[0072] In the context of the use, testing, or screening of compounds that are or may be 
modulators, the term "contacting" means that the compound(s) are caused to be in 
sufficient proximity to a particular molecule, complex, cell, tissue, organism, or other 
specified material that potential binding interactions and/or chemical reaction between the 
compound and other specified material can occur. 

[0073] In a related aspect, the invention provides a method for treating a patient 
suffering from or at risk of a disease or condition for which modulation of PYK2 activity 
provides a therapeutic or prophylactic effect, e.g., a disease or condition characterized by 
abnormal PYK2 kinase activity, where the method involves administering to the patient a 
compound that interacts with at least 2, or three or more of PYK2 residues 503,
505, 457, 488, 567, and 554 (e.g., a compound of Formula I). 

[0074] Specific diseases or disorders which might be treated or prevented include:
myasthenia gravis; neuroblastoma; disorders caused by neuronal toxins such as cholera
toxin, pertussis toxin, or snake venom; acute megakaryocyte myelosis; thrombocytopenia;
those of the central nervous system such as seizures, stroke, head trauma, spinal cord 
injury, hypoxia-induced nerve cell damage such as in cardiac arrest or neonatal distress, 
epilepsy, neurodegenerative diseases such as Alzheimer's disease, Huntington's disease 
and Parkinson's disease, dementia, muscle tension, depression, anxiety, panic disorder, 
obsessive-compulsive disorder, post-traumatic stress disorder, schizophrenia, neuroleptic
malignant syndrome, and Tourette's syndrome. Conditions that may be treated by PYK2 
inhibitors include epilepsy, schizophrenia, extreme hyperactivity in children, chronic pain, 
and acute pain. Examples of conditions that may be treated by PYK2 enhancers (for 
example a phosphatase inhibitor) include stroke, Alzheimer's, Parkinson's, other 
neurodegenerative diseases, and migraine.

[0075] Preferred disorders include epilepsy, stroke, schizophrenia, and Parkinson's
disorder, as there is a well established relationship between these disorders and the 
function of potassium channels. 

[0076] In addition, PYK2 can act as a target for therapeutics for treating cell 
proliferative diseases. Thus, in certain embodiments, the disease or condition is a 
proliferative disease or neoplasia, such as benign or malignant tumors, psoriasis, 
leukemias (such as myeloblastic leukemia), lymphoma, prostate cancer, liver cancer, 
breast cancer, sarcoma, neuroblastoma, Wilms' tumor, bladder cancer, thyroid cancer,
neoplasias of epithelial origin such as mammacarcinoma, a cancer of hematopoietic
cells, or a chronic inflammatory disease or condition, resulting, for example, from a 
persistent infection (e.g., tuberculosis, syphilis, fungal infection), from prolonged exposure 
to endogenous (e.g., elevated plasma lipids) or exogenous (e.g., silica, asbestos, cigarette 
tar, surgical sutures) toxins, and from autoimmune reactions (e.g., rheumatoid arthritis, 
systemic lupus erythematosus, multiple sclerosis, psoriasis). Thus, chronic inflammatory
diseases include many common medical conditions, such as rheumatoid arthritis, 
restenosis, psoriasis, multiple sclerosis, surgical adhesions, tuberculosis, and chronic 
inflammatory lung and airway diseases, such as asthma, pneumoconiosis, chronic
obstructive pulmonary disease, nasal polyps, and pulmonary fibrosis. PYK2 modulators 
may also be useful in inhibiting development of hematomous plaque and restenosis, in
controlling restenosis, as anti-metastatic agents, in treating diabetic complications, as
immunosuppressants, and in control of angiogenesis to the extent a PYK2 kinase is 
involved in a particular disease or condition. 

[0077] As crystals of PYK2 have been developed and analyzed, another aspect concerns 
an electronic representation of PYK2 (which may be a reduced length PYK2), for 
example, an electronic representation containing atomic coordinate representations 
corresponding to the coordinates listed for PYK2 in Table 1 or Table 2, or a schematic 
representation such as one showing secondary structure and/or chain folding, and may also 
show conserved active site residues. The PYK2 may be wild type, an allelic variant, a 
mutant form, or a modified form, e.g., as described herein.

[0078] The electronic representation can also be modified by replacing electronic 
representations of particular residues with electronic representations of other residues. 
Thus, for example, an electronic representation containing atomic coordinate
representations corresponding to the coordinates for PYK2 listed in Table 1 or Table 2 can 
be modified by the replacement of coordinates for a particular conserved residue in a 
binding site by a different amino acid. Likewise, a PYK2 representation can be modified 
by the respective substitutions, insertions, and/or deletions of amino acid residues to 
provide a representation of a structure for FAK kinase. Following a modification or 
modifications, the representation of the overall structure can be adjusted to allow for the 
known interactions that would be affected by the modification or modifications. In most 
cases, a modification involving more than one residue will be performed in an iterative 
manner. 

[0079] In addition, an electronic representation of a PYK2 binding compound or a test 
compound in the binding site can be included, e.g., a compound of Formula I. 

[0080] Likewise, in a related aspect, the invention concerns an electronic representation 
of a portion of a PYK2 kinase, a binding site (which can be an active site) or kinase 
domain, for example, residues 419-691. A binding site or kinase domain can be 
represented in various ways, e.g., as representations of atomic coordinates of residues 
around the binding site and/or as a binding site surface contour, and can include 
representations of the binding character of particular residues at the binding site, e.g., 
conserved residues. As for electronic representations of PYK2, a binding compound or 
test compound may be present in the binding site; the binding site may be of a wild type, 
variant, mutant form, or modified form of PYK2. 

[0081] In yet another aspect, the structural information of PYK2 can be used in a 
homology model (based on PYK2) for another kinase (such as FAK), thus providing an 
electronic representation of a PYK2 based homology model for a kinase. For example, the 
homology model can utilize atomic coordinates from Table 1 for conserved amino acid 
residues. In particular embodiments, atomic coordinates for a wild type, variant, modified
form, or mutated form of PYK2 can be used, including, for example, wild type, variants, 
modified forms, and mutant forms as described herein. In particular, PYK2 structure 
provides a very close homology model for FAK kinases. Thus, in particular embodiments 
the invention provides PYK2-based homology models of FAK.

[0082] In still another aspect, the invention provides an electronic representation of a
modified PYK2 crystal structure, that includes an electronic representation of the atomic
coordinates of a modified PYK2. In an exemplary embodiment, atomic coordinates of 
Table 1 or Table 2 can be modified by the replacement of atomic coordinates for a 
particular amino acid with atomic coordinates for a different amino acid. Modifications 
can include substitutions, deletions (e.g., C-terminal and/or N-terminal deletions),
insertions (internal, C-terminal, and/or N-terminal) and/or side chain modifications. 

[0083] In another aspect, the PYK2 structural information provides a method for 
developing useful biological agents based on PYK2, by analyzing a PYK2 structure to 
identify at least one sub-structure for forming the biological agent. Such sub-structures 
can include epitopes for antibody formation, and the method includes developing 
antibodies against the epitopes, e.g., by injecting an epitope presenting composition in a 
mammal such as a rabbit, guinea pig, pig, goat, or horse. The sub-structure can also 
include a mutation site at which mutation is expected to or is known to alter the activity of 
the PYK2, and the method includes creating a mutation at that site. Still further, the sub- 
structure can include an attachment point for attaching a separate moiety, for example, a 
peptide, a polypeptide, a solid phase material (e.g., beads, gels, chromatographic media, 
slides, chips, plates, and well surfaces), a linker, and a label (e.g., a direct label such as a 
fluorophore or an indirect label, such as biotin or other member of a specific binding pair). 
The method can include attaching the separate moiety. 

[0084] In another aspect, the invention provides a method for identifying potential 
PYK2 binding compounds by fitting at least one electronic representation of a compound
in an electronic representation of a PYK2 binding site. The representation of the binding 
site may be part of an electronic representation of a larger portion(s) or all of a PYK2 
molecule or may be a representation of only the binding site or active site. The electronic 
representation may be as described above or otherwise described herein. 

[0085] In particular embodiments, the method involves fitting a computer representation 
of a compound from a computer database with a computer representation of the active site 
of a PYK2 kinase, and involves removing a computer representation of a compound 
complexed with the PYK2 molecule and identifying compounds that best fit the active site 
based on favorable geometric fit and energetically favorable complementary interactions
as potential binding compounds. 

[0086] In other embodiments, the method involves modifying a computer representation 
of a compound complexed with a PYK2 molecule, by the deletion or addition or both of 
one or more chemical groups; fitting a computer representation of a compound from a 
computer database with a computer representation of the active site of the PYK2 
molecule; and identifying compounds that best fit the active site based on favorable 
geometric fit and energetically favorable complementary interactions as potential binding 
compounds. 

[0087] In still other embodiments, the method involves removing a computer 
representation of a compound complexed with a PYK2 kinase, and searching a database 
for compounds having structural similarity to the complexed compound using a compound 
searching computer program or replacing portions of the complexed compound with 
similar chemical structures using a compound construction computer program. 

[0088] Fitting a compound can include determining whether a compound will interact 
with one or more of PYK2 residues 503, 505, 457, 488, 567, and 554.
Compounds selected for fitting or that are complexed with PYK2 can, for example, be 
compounds of Formula I. 
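
One simple way to test whether a fitted compound interacts with the listed residues is a distance check between ligand atoms and atoms of those residues. The following is a sketch under stated assumptions: the 4.0 Angstrom cutoff, the flat atom-list input format, and the function name are all hypothetical choices for illustration, and a real fitting program would typically score specific interaction types rather than raw distance.

```python
import math
from typing import Iterable, List, Set, Tuple

Coord = Tuple[float, float, float]

# Residue numbers named in the text.
KEY_RESIDUES = (503, 505, 457, 488, 567, 554)


def contacted_key_residues(
    ligand_atoms: List[Coord],
    protein_atoms: Iterable[Tuple[int, Coord]],
    cutoff: float = 4.0,  # assumed contact distance in Angstroms
) -> Set[int]:
    """Return the key residues having any atom within cutoff of the ligand.

    protein_atoms: (residue_number, coordinate) pairs, one per atom.
    """
    contacted: Set[int] = set()
    for residue_number, atom in protein_atoms:
        if residue_number not in KEY_RESIDUES:
            continue
        if any(math.dist(atom, lig) <= cutoff for lig in ligand_atoms):
            contacted.add(residue_number)
    return contacted
```

The size of the returned set then indicates whether the compound contacts at least one, any two, or up to all six of the residues.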

[0089] In another aspect, the invention concerns a method for attaching a PYK2 kinase 
binding compound to an attachment component, as well as a method for identifying
attachment sites on a PYK2 kinase binding compound. The method involves identifying
energetically allowed sites for attachment of an attachment component for the binding 
compound bound to a binding site of PYK2; and attaching the compound or a derivative 
thereof to the attachment component at the energetically allowed site. 

[0090] As used in connection with binding compounds, an "attachment component" 
refers to a moiety that is attached to a binding compound for adding a functionality other 
than binding with the target molecule and that does not prevent such binding. Examples 
include direct and indirect labels, linkers, and hapten and other specific recognition 
moieties. Linkers (including traceless linkers) can be incorporated, for example, for 
attachment to a solid phase or to another molecule or other moiety. Such attachment can
be formed by synthesizing the compound or derivative on the linker attached to a solid
phase medium, e.g., in a combinatorial synthesis of a plurality of compounds. Likewise, the
attachment to a solid phase medium can provide an affinity medium (e.g., for affinity
chromatography). Labels can be a directly detectable label such as a fluorophore, or an
indirectly detectable label such as a member of a specific binding pair, e.g., biotin.

[0091] The ability to identify energetically allowed sites on a PYK2 kinase binding 
compound also, in a related aspect, provides modified binding compounds that have 
linkers attached, for example, compounds of Formula I, preferably at an energetically 
allowed site for binding of the modified compound to PYK2. The linker can be attached 
to an attachment component as described above. 

[0092] Another aspect concerns a modified PYK2 polypeptide that includes a 
modification that makes the modified PYK2 more similar than native PYK2 to another
kinase, and can also include other mutations or other modifications. In various 
embodiments, the polypeptide includes a full-length PYK2 polypeptide, includes a 
modified PYK2 binding site, includes at least 20, 30, 40, 50, 60, 70, or 80 contiguous 
amino acid residues derived from PYK2 including a conserved site. 

[0093] Still another aspect of the invention concerns a method for developing a ligand 
for a kinase that includes conserved residues matching any one, 2, 3, 4, 5, or 6 of PYK2 
residues 503, 505, 457, 488, 567, and 554, by determining whether a compound of 
Formula I binds to the kinase. The method can also include determining whether the 
compound modulates the activity of the kinase. Preferably the kinase has at least 50, 55, 
60, or 70% identity over an equal length kinase domain segment. 

[0094] In particular embodiments, the determining includes computer fitting the 
compound in a binding site of the kinase and/or the method includes forming a co-crystal 
of the kinase and the compound. Such co-crystals can be used for determining the binding
orientation of the compound with the kinase and/or provide structural information on the
kinase, e.g., on the binding site and interacting amino acid residues. Such binding
orientation and/or other structural information can be determined using X-ray
crystallography.

[0095] Reference to "matching" of a specified conserved amino acid residue in a kinase
domain means that in a maximal alignment of the amino acid sequences of that kinase 
domain with a different kinase domain, there is an amino acid residue aligned with the 
specified residue that is either the same amino acid or represents a conservative 
substitution. Preferably, the matching amino acid residue is within 5 angstroms rms in an 
overlay of crystal structure atomic coordinates for backbone atoms. 
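
The sequence-based part of the "matching" definition, together with the percent identity criterion mentioned above, can be sketched as follows. The grouping of amino acids treated as conservative substitutions is an assumption made for this example, since the text does not enumerate the groups, and the inputs are assumed to be already-aligned one-letter sequences of equal length. The structural criterion (backbone overlay within 5 angstroms rms) is not covered here.

```python
# Assumed conservative-substitution groups (one-letter codes); illustrative
# only, as the text does not define which substitutions are conservative.
CONSERVATIVE_GROUPS = (
    frozenset("ILVM"),
    frozenset("FYW"),
    frozenset("KRH"),
    frozenset("DE"),
    frozenset("NQ"),
    frozenset("ST"),
)


def residues_match(a: str, b: str) -> bool:
    """True if aligned residues are identical or a conservative substitution."""
    if a == "-" or b == "-":
        return False  # a gap cannot match a residue
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity(segment_a: str, segment_b: str) -> float:
    """Percent of identical positions over two equal-length aligned segments."""
    if len(segment_a) != len(segment_b) or not segment_a:
        raise ValueError("segments must be non-empty and of equal length")
    identical = sum(
        1 for a, b in zip(segment_a, segment_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(segment_a)
```

A kinase domain segment could then be compared against the 50, 55, 60, or 70% identity thresholds by calling percent_identity on the aligned segments.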

[0096] The invention also provides compounds that bind to and/or modulate (e.g., 
inhibit) PYK2, e.g., PYK2 kinase activity. Accordingly, in aspects and embodiments 
involving PYK2 binding compounds, molecular scaffolds, and ligands or modulators, the 
compound is a weak binding compound; a moderate binding compound; a strong binding 
compound; the compound interacts with one or more of PYK2 residues 503, 505, 457, 
488, 567, and 554; the compound is a small molecule; the compound binds to a plurality 
of different kinases (e.g., at least 3, 5, 10, 15, 20 different kinases). In particular 
embodiments, the invention concerns compounds of Formula I, as described below. 

[0097] Thus, in certain embodiments, the invention concerns compounds of Formula I: 

[Chemical structure drawing of Formula I]

Formula I

where: 

[0098] R 1 is hydrogen, trifluoromethyl, optionally substituted lower alkyl, optionally
substituted lower alkenyl, optionally substituted lower alkynyl, optionally substituted 
cycloalkyl, optionally substituted heterocycloalkyl, optionally substituted aryl, optionally 
substituted aralkyl, optionally substituted heteroaryl, optionally substituted heteroaralkyl, 
or NR 16 R 17 ;

[0099] R 2 is hydrogen, optionally substituted lower alkyl, optionally substituted lower 
alkenyl, optionally substituted lower alkynyl, optionally substituted cycloalkyl, optionally 
substituted heterocycloalkyl, optionally substituted aryl, optionally substituted aralkyl, 
optionally substituted heteroaryl, optionally substituted heteroaralkyl, -C(X)R ,
C(X)NR 16 R 17 , or -S(O 2 )R 21 ;

[0100] R 3 is hydrogen, trifluoromethyl, optionally substituted alkoxyl, optionally 
substituted thioalkoxy, optionally substituted amine, optionally substituted lower alkyl, 
optionally substituted lower alkenyl, optionally substituted lower alkynyl, optionally 
substituted cycloalkyl, optionally substituted heterocycloalkyl, optionally substituted aryl, 
optionally substituted aralkyl, optionally substituted heteroaryl, or optionally substituted 
heteroaralkyl; 

[0101] R 16 and R 17 are independently hydrogen, optionally substituted lower alkyl, 
optionally substituted lower alkenyl, optionally substituted lower alkynyl, optionally 
substituted cycloalkyl, optionally substituted heterocycloalkyl, optionally substituted aryl, 
optionally substituted aralkyl, optionally substituted heteroaryl, or optionally substituted
heteroaralkyl;

[0102] R 20 is hydroxyl, optionally substituted lower alkoxy, optionally substituted 
amine, optionally substituted lower alkyl, optionally substituted lower alkenyl, optionally 
substituted lower alkynyl, optionally substituted cycloalkyl, optionally substituted 
heterocycloalkyl, optionally substituted aryl, optionally substituted aralkyl, optionally 
substituted heteroaryl, or optionally substituted heteroaralkyl; 

[0103] R 21 is optionally substituted lower alkoxy, optionally substituted amine, 
optionally substituted lower alkyl, optionally substituted lower alkenyl, optionally 
substituted lower alkynyl, optionally substituted cycloalkyl, optionally substituted 
heterocycloalkyl, optionally substituted aryl, optionally substituted aralkyl, optionally 
substituted heteroaryl, or optionally substituted heteroaralkyl; 

[0104] X = O or S.

[0105] Y = S, O, NR 16 R 17 , -C(X)R 20 or optionally substituted alkyl. 

[0106] In Formula I and the descriptions of substituents, subscripts and superscripts are 
to be regarded as equivalent.

[0107] In certain embodiments involving compounds of Formula I, X and Y are O; X is
O and Y is S; X is O and Y is NR 16 R 17 ; X is O and Y is -C(X)R 20 ; X is S and Y is O; X is
S and Y is S; X is S and Y is NR 16 R 17 ; X is S and Y is -C(X)R 20 .

[0108] In certain embodiments, X = O, Y = O, and R 1 is hydrogen; X = O, Y = O, and 
R 2 is hydrogen; X = O, Y = S, and R 1 is hydrogen; X = O, Y = S, and R 2 is hydrogen; X = 
O, Y = NR 16 R 17 , and R 1 is hydrogen; X = O, Y = S, and R 2 is hydrogen; X = O, Y =
NR 16 R 17 , and R 2 is hydrogen; X = O, Y = -C(X)R 20 , and R 1 is hydrogen; X = O, Y = -
C(X)R 20 , and R 2 is hydrogen; X = O, Y = optionally substituted alkyl, and R 1 is hydrogen; 
X = O, Y = optionally substituted alkyl, and R 2 is hydrogen. 

[0109] In certain embodiments, X = S, Y = O, and R 1 is hydrogen; X = S, Y = O, and R 2 
is hydrogen; X = S, Y = S, and R 1 is hydrogen; X = S, Y = S, and R 2 is hydrogen; X = S, 
Y = NR 16 R 17 , and R 1 is hydrogen; X = S, Y = S, and R 2 is hydrogen; X = S, Y =
NR 16 R 17 , and R 2 is hydrogen; X = S, Y = -C(X)R 20 , and R 1 is hydrogen; X = S, Y = -
C(X)R 20 , and R 2 is hydrogen; X = S, Y = optionally substituted alkyl, and R 1 is hydrogen; 
X = S, Y = optionally substituted alkyl, and R 2 is hydrogen. 

[0110] In certain embodiments, R 1 is hydrogen, optionally substituted lower alkyl, 
optionally substituted cycloalkyl, or NR 16 R 17 .

[0111] In certain embodiments, R 2 is hydrogen, optionally substituted lower alkyl, 
optionally substituted cycloalkyl, C(X)NR 16 R 17 , or -S(O 2 )R 21 .

[0112] An additional aspect of this invention relates to pharmaceutical formulations that
include a therapeutically effective amount of a compound of Formula I and at least one
pharmaceutically acceptable carrier or excipient. The composition can include a plurality
of different pharmacologically active compounds.

[0113] "Halo" or "Halogen" - alone or in combination means all halogens, that is, chloro
(Cl), fluoro (F), bromo (Br), iodo (I).

[0114] "Hydroxyl" refers to the group -OH. 

[0115] "Thiol" or "mercapto" refers to the group -SH.






Atty. Dkt. No.: 039363-1202 



[0116] "Alkyl" - alone or in combination means an alkane-derived radical containing 
from 1 to 20, preferably 1 to 15, carbon atoms (unless specifically defined). It is a straight 
chain alkyl, branched alkyl or cycloalkyl. Preferably, straight or branched alkyl groups 
containing from 1-15, more preferably 1 to 8, even more preferably 1-6, yet more 
preferably 1-4 and most preferably 1-2, carbon atoms, such as methyl, ethyl, propyl, 
isopropyl, butyl, t-butyl and the like. The term "lower alkyl" is used herein to describe the 
straight chain alkyl groups described immediately above. Preferably, cycloalkyl groups 
are monocyclic, bicyclic or tricyclic ring systems of 3-8, more preferably 3-6, ring 
members per ring, such as cyclopropyl, cyclopentyl, cyclohexyl, adamantyl and the like. 
Alkyl also includes a straight chain or branched alkyl group that contains or is interrupted 
by a cycloalkyl portion. The straight chain or branched alkyl group is attached at any 
available point to produce a stable compound. Examples of this include, but are not 
limited to, 4-(isopropyl)-cyclohexylethyl or 2-methyl-cyclopropylpentyl. A substituted 
alkyl is a straight chain alkyl, branched alkyl, or cycloalkyl group defined previously, 
independently substituted with 1 to 3 groups or substituents of halo, hydroxy, alkoxy, 
alkylthio, alkylsulfinyl, alkylsulfonyl, acyloxy, aryloxy, heteroaryloxy, amino optionally 
mono- or di-substituted with alkyl, aryl or heteroaryl groups, amidino, urea optionally 
substituted with alkyl, aryl, heteroaryl or heterocyclyl groups, aminosulfonyl optionally N- 
mono- or N,N-di-substituted with alkyl, aryl or heteroaryl groups, alkylsulfonylamino, 
arylsulfonylamino, heteroarylsulfonylamino, alkylcarbonylamino, arylcarbonylamino, 
heteroarylcarbonylamino, or the like. 

[0117] "Alkenyl" - alone or in combination means a straight, branched, or cyclic
hydrocarbon containing 2-20, preferably 2-17, more preferably 2-10, even more preferably
2-8, most preferably 2-4, carbon atoms and at least one, preferably 1-3, more preferably 1-2,
most preferably one, carbon to carbon double bond. In the case of a cycloalkyl group,
conjugation of more than one carbon to carbon double bond is not such as to confer
aromaticity to the ring. Carbon to carbon double bonds may be either contained within a
cycloalkyl portion, with the exception of cyclopropyl, or within a straight chain or
branched portion. Examples of alkenyl groups include ethenyl, propenyl, isopropenyl,
butenyl, cyclohexenyl, cyclohexenylalkyl and the like. A substituted alkenyl is the
straight chain alkenyl, branched alkenyl or cycloalkenyl group defined previously,
independently substituted with 1 to 3 groups or substituents of halo, hydroxy, alkoxy,
alkylthio, alkylsulfinyl, alkylsulfonyl, acyloxy, aryloxy, heteroaryloxy, amino optionally
mono- or di-substituted with alkyl, aryl or heteroaryl groups, amidino, urea optionally
substituted with alkyl, aryl, heteroaryl or heterocyclyl groups, aminosulfonyl optionally N- 
mono- or N,N-di-substituted with alkyl, aryl or heteroaryl groups, alkylsulfonylamino, 
arylsulfonylamino, heteroarylsulfonylamino, alkylcarbonylamino, arylcarbonylamino, 
heteroarylcarbonylamino, carboxy, alkoxycarbonyl, aryloxycarbonyl, 
heteroaryloxycarbonyl, or the like attached at any available point to produce a stable 
compound. 

[0118] "Alkynyl" - alone or in combination means a straight or branched hydrocarbon 
containing 2-20, preferably 2-17, more preferably 2-10, even more preferably 2-8, most 
preferably 2-4, carbon atoms containing at least one, preferably one, carbon to carbon 
triple bond. Examples of alkynyl groups include ethynyl, propynyl, butynyl and the like. 
A substituted alkynyl refers to the straight chain alkynyl or branched alkynyl defined
previously, independently substituted with 1 to 3 groups or substituents of halo, hydroxy, 
alkoxy, alkylthio, alkylsulfinyl, alkylsulfonyl, acyloxy, aryloxy, heteroaryloxy, amino 
optionally mono- or di-substituted with alkyl, aryl or heteroaryl groups, amidino, urea 
optionally substituted with alkyl, aryl, heteroaryl or heterocyclyl groups, aminosulfonyl 
optionally N-mono- or N,N-di-substituted with alkyl, aryl or heteroaryl groups, 
alkylsulfonylamino, arylsulfonylamino, heteroarylsulfonylamino, alkylcarbonylamino, 
arylcarbonylamino, heteroarylcarbonylamino, or the like attached at any available point to 
produce a stable compound. 

[0119] "Alkyl alkenyl" refers to a group -R-CR'=CR'''R'''', where R is lower alkyl, or
substituted lower alkyl, R', R''', R'''' may independently be hydrogen, halogen, lower
alkyl, substituted lower alkyl, acyl, aryl, substituted aryl, hetaryl, or substituted hetaryl as 
defined below. 

[0120] "Alkyl alkynyl" refers to a group -RC≡CR' where R is lower alkyl or substituted
lower alkyl, R' is hydrogen, lower alkyl, substituted lower alkyl, acyl, aryl, substituted 
aryl, hetaryl, or substituted hetaryl as defined below. 

[0121] "Alkoxy" denotes the group -OR, where R is lower alkyl, substituted lower alkyl, 
acyl, aryl, substituted aryl, aralkyl, substituted aralkyl, heteroalkyl, heteroarylalkyl, 
cycloalkyl, substituted cycloalkyl, cycloheteroalkyl, or substituted cycloheteroalkyl as 
defined.

[0122] "Alkylthio" or "thioalkoxy" denotes the group -SR, -S(O) n=1-2 -R, where R is
lower alkyl, substituted lower alkyl, aryl, substituted aryl, aralkyl or substituted aralkyl as 
defined herein. 

[0123] "Acyl" denotes groups -C(O)R, where R is hydrogen, lower alkyl, substituted
lower alkyl, aryl, substituted aryl and the like as defined herein. 

[0124] "Aryloxy" denotes groups -OAr, where Ar is an aryl, substituted aryl, heteroaryl, 
or substituted heteroaryl group as defined herein. 

[0125] "Amino" or substituted amine denotes the group NRR', where R and R' may 
independently be hydrogen, lower alkyl, substituted lower alkyl, aryl, substituted aryl,
hetaryl, or substituted heteroaryl as defined herein, acyl or sulfonyl. 

[0126] "Amido" denotes the group -C(O)NRR', where R and R' may independently be
hydrogen, lower alkyl, substituted lower alkyl, aryl, substituted aryl, hetaryl, or substituted
hetaryl as defined herein.

[0127] "Carboxyl" denotes the group -C(O)OR, where R is hydrogen, lower alkyl,
substituted lower alkyl, aryl, substituted aryl, hetaryl, or substituted hetaryl as defined
herein.

[0128] "Aryl" - alone or in combination means phenyl or naphthyl optionally 
carbocyclic fused with a cycloalkyl of preferably 5-7, more preferably 5-6, ring members 
and/or optionally substituted with 1 to 3 groups or substituents of halo, hydroxy, alkoxy, 
alkylthio, alkylsulfinyl, alkylsulfonyl, acyloxy, aryloxy, heteroaryloxy, amino optionally 
mono- or di-substituted with alkyl, aryl or heteroaryl groups, amidino, urea optionally 
substituted with alkyl, aryl, heteroaryl or heterocyclyl groups, aminosulfonyl optionally N- 
mono- or N,N-di-substituted with alkyl, aryl or heteroaryl groups, alkylsulfonylamino, 
arylsulfonylamino, heteroarylsulfonylamino, alkylcarbonylamino, arylcarbonylamino, 
heteroarylcarbonylamino, or the like. 

[0129] "Substituted aryl" refers to aryl optionally substituted with one or more 
functional groups, e.g., halogen, lower alkyl, lower alkoxy, alkylthio, acetylene, amino, 
amido, carboxyl, hydroxyl, aryl, aryloxy, heterocycle, heteroaryl, substituted heteroaryl, 
nitro, cyano, thiol, sulfamido and the like. 






[0130] "Heterocycle" refers to a saturated, unsaturated, or aromatic carbocyclic group 
having a single ring (e.g., morpholino, pyridyl or furyl) or multiple condensed rings (e.g., 
naphthpyridyl, quinoxalyl, quinolinyl, indolizinyl or benzo[b]thienyl) and having at least 
one hetero atom, such as N, O or S, within the ring, which can optionally be unsubstituted 
or substituted with, e.g., halogen, lower alkyl, lower alkoxy, alkylthio, acetylene, amino, 
amido, carboxyl, hydroxyl, aryl, aryloxy, heterocycle, hetaryl, substituted hetaryl, nitro, 
cyano, thiol, sulfamido and the like. 

[0131] "Heteroaryl" - alone or in combination means a monocyclic aromatic ring 
structure containing 5 or 6 ring atoms, or a bicyclic aromatic group having 8 to 10 atoms, 
containing one or more, preferably 1-4, more preferably 1-3, even more preferably 1-2, 
heteroatoms independently selected from the group O, S, and N, and optionally substituted 
with 1 to 3 groups or substituents of halo, hydroxy, alkoxy, alkylthio, alkylsulfinyl, 
alkylsulfonyl, acyloxy, aryloxy, heteroaryloxy, amino optionally mono- or di-substituted 
with alkyl, aryl or heteroaryl groups, amidino, urea optionally substituted with alkyl, aryl, 
heteroaryl or heterocyclyl groups, aminosulfonyl optionally N-mono- or N,N-di- 
substituted with alkyl, aryl or heteroaryl groups, alkylsulfonylamino, arylsulfonylamino, 
heteroarylsulfonylamino, alkylcarbonylamino, arylcarbonylamino,
heteroarylcarbonylamino, or the like. Heteroaryl is also intended to include oxidized S or
N, such as sulfinyl, sulfonyl and N-oxide of a tertiary ring nitrogen. A carbon or nitrogen 
atom is the point of attachment of the heteroaryl ring structure such that a stable aromatic 
ring is retained. Examples of heteroaryl groups are pyridinyl, pyridazinyl, pyrazinyl, 
quinazolinyl, purinyl, indolyl, quinolinyl, pyrimidinyl, pyrrolyl, oxazolyl, thiazolyl, 
thienyl, isoxazolyl, oxathiadiazolyl, isothiazolyl, tetrazolyl, imidazolyl, triazinyl, furanyl, 
benzofuryl, indolyl and the like. A substituted heteroaryl contains a substituent attached at 
an available carbon or nitrogen to produce a stable compound. 

[0132] "Heterocyclyl" - alone or in combination means a non-aromatic cycloalkyl group 
having from 5 to 10 atoms in which from 1 to 3 carbon atoms in the ring are replaced by 
heteroatoms of O, S or N, and are optionally benzo fused or fused heteroaryl of 5-6 ring 
members and/or are optionally substituted as in the case of cycloalkyl. Heterocyclyl is also
intended to include oxidized S or N, such as sulfinyl, sulfonyl and N-oxide of a tertiary
ring nitrogen. The point of attachment is at a carbon or nitrogen atom. Examples of
heterocyclyl groups are tetrahydrofuranyl, dihydropyridinyl, piperidinyl, pyrrolidinyl,
piperazinyl, dihydrobenzofuryl, dihydroindolyl, and the like. A substituted heterocyclyl
contains a substituent nitrogen attached at an available carbon or nitrogen to produce a
stable compound.

[0133] "Substituted heteroaryl" refers to a heterocycle optionally mono or poly 
substituted with one or more functional groups, e.g., halogen, lower alkyl, lower alkoxy, 
alkylthio, acetylene, amino, amido, carboxyl, hydroxyl, aryl, aryloxy, heterocycle, 
substituted heterocycle, hetaryl, substituted hetaryl, nitro, cyano, thiol, sulfamido and the 
like. 

[0134] "Aralkyl" refers to the group -R-Ar where Ar is an aryl group and R is lower 
alkyl or substituted lower alkyl group. Aryl groups can optionally be unsubstituted or 
substituted with, e.g., halogen, lower alkyl, alkoxy, alkylthio, acetylene, amino, amido, 
carboxyl, hydroxyl, aryl, aryloxy, heterocycle, substituted heterocycle, hetaryl, substituted 
hetaryl, nitro, cyano, thiol, sulfamido and the like. 

[0135] "Heteroalkyl" refers to the group -R-Het where Het is a heterocycle group and R 
is a lower alkyl group. Heteroalkyl groups can optionally be unsubstituted or substituted 
with e.g., halogen, lower alkyl, lower alkoxy, alkylthio, acetylene, amino, amido, 
carboxyl, aryl, aryloxy, heterocycle, substituted heterocycle, hetaryl, substituted hetaryl, 
nitro, cyano, thiol, sulfamido and the like. 

[0136] "Heteroarylalkyl" refers to the group -R-HetAr where HetAr is a heteroaryl
group and R is lower alkyl or substituted lower alkyl. Heteroarylalkyl groups can optionally
be unsubstituted or substituted with, e.g., halogen, lower alkyl, substituted lower alkyl, 
alkoxy, alkylthio, acetylene, aryl, aryloxy, heterocycle, substituted heterocycle, hetaryl, 
substituted hetaryl, nitro, cyano, thiol, sulfamido and the like. 

[0137] "Cycloalkyl" refers to a divalent cyclic or polycyclic alkyl group containing 3 to 
15 carbon atoms. 

[0138] "Substituted cycloalkyl" refers to a cycloalkyl group comprising one or more
substituents, e.g., halogen, lower alkyl, substituted lower alkyl, alkoxy, alkylthio,
acetylene, aryl, aryloxy, heterocycle, substituted heterocycle, hetaryl, substituted hetaryl,
nitro, cyano, thiol, sulfamido and the like.

[0139] "Cycloheteroalkyl" refers to a cycloalkyl group wherein one or more of the ring
carbon atoms is replaced with a heteroatom (e.g., N, O, S or P). 

[0140] "Substituted cycloheteroalkyl" refers to a cycloheteroalkyl group as herein 
defined which contains one or more substituents, such as halogen, lower alkyl, lower 
alkoxy, alkylthio, acetylene, amino, amido, carboxyl, hydroxyl, aryl, aryloxy, heterocycle, 
substituted heterocycle, hetaryl, substituted hetaryl, nitro, cyano, thiol, sulfamido and the 
like. 

[0141] "Alkyl cycloalkyl" denotes the group -R-cycloalkyl where cycloalkyl is a 
cycloalkyl group and R is a lower alkyl or substituted lower alkyl. Cycloalkyl groups can 
optionally be unsubstituted or substituted with e.g. halogen, lower alkyl, lower alkoxy, 
alkylthio, acetylene, amino, amido, carboxyl, hydroxyl, aryl, aryloxy, heterocycle, 
substituted heterocycle, hetaryl, substituted hetaryl, nitro, cyano, thiol, sulfamido and the 
like. 

[0142] "Alkyl cycloheteroalkyl" denotes the group -R-cycloheteroalkyl where R is a 
lower alkyl or substituted lower alkyl. Cycloheteroalkyl groups can optionally be 
unsubstituted or substituted with e.g. halogen, lower alkyl, lower alkoxy, alkylthio, amino, 
amido, carboxyl, acetylene, hydroxyl, aryl, aryloxy, heterocycle, substituted heterocycle, 
hetaryl, substituted hetaryl, nitro, cyano, thiol, sulfamido and the like. 

[0143] In addition to compounds (including molecular scaffolds) of Formula I as 
described herein, additional types of compounds can be used as modulators (e.g., 
inhibitors) of PYK2, and for development of further PYK2 ligands. In particular,
compounds of the types described in Bremer et al., U.S. Application 10/664,421, filed
September 16, 2003, and Bremer et al., U.S. Application 60/503,277, filed September 15,
2003, both of which are incorporated herein in their entireties, including drawings, can be
used.

[0144] An additional aspect of this invention relates to pharmaceutical formulations that
include a therapeutically effective amount of a compound of Formula I, and at least one
pharmaceutically acceptable carrier or excipient. The composition can include a plurality
of different pharmacologically active compounds.

[0145] Additional aspects and embodiments will be apparent from the following 
Detailed Description and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS
[0146] FIGURE 1 shows a ribbon diagram schematic representation of the PYK2 active site.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
[0147] The Tables will first be briefly described. 

[0148] Table 1 provides atomic coordinates for human PYK2 kinase domain. In this 
table and in Table 2, the various columns in the lines beginning with "ATOM" have the 
following content, beginning with the left-most column: 
ATOM: Refers to the relevant moiety for the table row.

Atom number: Refers to the arbitrary atom number designation within the coordinate
table.

Atom Name: Identifier for the atom present at the particular coordinates.

Chain ID: Refers to one monomer of the protein in the crystal, e.g., chain "A", or
to another compound present in the crystal, e.g., HOH for water, and L for a ligand or
binding compound. Multiple copies of the protein monomers will have different chain IDs.

Residue Number: The amino acid residue number in the chain.

X, Y, Z: Respectively are the X, Y, and Z coordinate values.

Occupancy: Describes the fraction of time the atom is observed in the crystal. For
example, occupancy = 1 means that the atom is present all the time; occupancy = 0.5
indicates that the atom is present in the location 50% of the time.

B-factor: A measure of the thermal motion of the atom.

Element: Identifier for the element.
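
As an illustration of the column content described above, the following minimal sketch reads one "ATOM" line into named fields. It assumes, without confirmation from the tables themselves, that the lines follow the conventional fixed-column Protein Data Bank layout, which also carries a three-letter residue name. The names AtomRecord and parse_atom_line are hypothetical, and the sample line is made up rather than taken from Table 1 or Table 2.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    serial: int        # "Atom number"
    name: str          # "Atom Name"
    res_name: str      # residue name (assumed present, as in PDB files)
    chain_id: str      # "Chain ID"
    res_seq: int       # "Residue Number"
    x: float
    y: float
    z: float
    occupancy: float   # fraction of time the atom is observed at this site
    b_factor: float    # isotropic temperature factor
    element: str       # "Element"


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one ATOM line, assuming the fixed-column PDB layout."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    line = line.ljust(78)  # pad so the element slice is always valid
    return AtomRecord(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21].strip(),
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
        element=line[76:78].strip(),
    )


# Made-up sample line, built with explicit widths so the columns line up.
SAMPLE = (
    "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2}"
).format("ATOM", 1, " N", "", "MET", "A", 1, "",
         27.340, 24.430, 2.614, 1.00, 9.67, "N")

atom = parse_atom_line(SAMPLE)
print(atom.name, atom.chain_id, atom.res_seq, atom.occupancy)
```

If the tables are instead whitespace-delimited, splitting each line on whitespace would be the simpler choice; the fixed-column slices are used here only because that is the usual convention for such coordinate files.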

[0149] In addition, the lines that begin with "ANISOU" present the anisotropic
temperature factors. The anisotropic temperature factors are related to the corresponding
isotropic temperature factors (B-factors) in the "ATOM" lines in the table. Following
"ANISOU", the next 4 entries are "Atom number", "Atom name", "Residue name", and
"Residue number", and are the same as the respective corresponding "ATOM" line
entries. The next 6 entries are the anisotropic temperature factors U(1,1), U(2,2), U(3,3),
U(1,2), U(1,3), and U(2,3), in order (scaled by a factor of 10^4 (Angstroms^2) and presented
as integers).

[0150] Table 2 provides atomic coordinates for PYK2 with AMPPNP
(5'-adenylylimidodiphosphate) in the binding site.
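
The scaling of the "ANISOU" entries can be illustrated with a short sketch. It assumes the six integers are given in the order U(1,1), U(2,2), U(3,3), U(1,2), U(1,3), U(2,3), as stated above. The conversion to an equivalent isotropic B-factor uses the conventional crystallographic relation B = 8 * pi^2 * U, with U taken as the mean of the three diagonal terms; that relation is standard practice and is not stated in this description. The function names and the input values are hypothetical.

```python
import math

ANISOU_SCALE = 1.0e4  # stored integers are U(i,j) in Angstroms^2 times 10^4


def anisou_to_u(values):
    """Convert the six stored integers to U(i,j) values in Angstroms^2."""
    if len(values) != 6:
        raise ValueError("expected six anisotropic terms")
    return tuple(v / ANISOU_SCALE for v in values)


def equivalent_b(values):
    """Equivalent isotropic B-factor from the diagonal terms.

    Uses B_eq = 8 * pi**2 * (U11 + U22 + U33) / 3 (conventional relation,
    assumed here rather than taken from the text).
    """
    u11, u22, u33, _u12, _u13, _u23 = anisou_to_u(values)
    return 8.0 * math.pi ** 2 * (u11 + u22 + u33) / 3.0


# Made-up integers for illustration; not taken from Table 1 or Table 2.
example = (1267, 1267, 1267, 0, 0, 0)
print(anisou_to_u(example))
print(round(equivalent_b(example), 2))
```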

[0151] Table 3 provides an alignment of kinase domains for several kinases, including 
human PYK2, providing identification of residues conserved between various members of 
the set. The residue number is for PYK2. 

[0152] Table 4 provides the nucleic acid and amino acid sequences for human PYK2 
kinase domain. 

[0153] Table 5 provides representative assay results for kinase activity of PYK2 kinase 
domain in the presence of ATP and in the presence of several ATP analogs. 

I. Introduction 

[0154] The present invention concerns the use of PYK2 kinase structures, structural 
information, and related compositions for identifying compounds that modulate PYK2 
kinase activity and for determining structures of other kinases.

[0155] PYK2 kinase is involved in a number of disease conditions. For example, as
indicated in the Background above, PYK2 functions as a neurotransmitter regulator, and
thus modulation of PYK2 can enhance or inhibit such signaling. In addition, PYK2 is
involved in linking the G protein-coupled pathway with the sos/grb pathway
for activation of MAP kinase signal transduction. This may involve the binding of src. Thus,
PYK2 can also affect cell proliferation.

Exemplary Diseases Associated with PYK2. 

[0156] As indicated above, modulation of PYK2 activity is beneficial for treatment or
prevention of a variety of diseases and conditions, such as those relating to its roles in
signal transduction. As a result, PYK2 inhibitors have therapeutic applications in the
treatment of proliferative diseases, such as various cancers, osteoporosis, and
inflammation, as well as other disease states, such as those referenced in the Summary
above and those otherwise indicated herein. PYK2, screening for PYK2 modulators, and
methods for using PYK2 modulators, along with related assays, techniques, and data, are
described, for example, in Duong et al., PCT Application No. PCT/US98/02792, PCT
Publication WO 98/35056; Schlessinger et al., PCT Application No. PCT/US98/27871,
PCT Publication WO 00/40971; Lev, et al., PCT Application PCT/US97/22565, PCT
Publication WO 98/26054; Lev et al., PCT Application PCT/US95/15846, PCT
Publication WO 96/18738, which are incorporated herein in their entireties.

Osteoporosis 

[0157] Activation of osteoclasts is initiated by adhesion of osteoclasts to the bone surface.
Cytoskeletal rearrangement results in formation of a sealing zone and a polarized ruffled 
membrane. Pyk2 was found to be highly expressed in osteoclasts. (Duong et al. (1998) 
"Pyk2 in osteoclasts is an adhesion kinase, localized in the sealing zone, activated by 
ligation of alpha(v)beta3 integrin, and phosphorylated by Src kinase." J. Clin. Invest. 
102:881-892.) Studies indicate that Pyk2 is involved in the adhesion-induced formation of 
the sealing zone and is required for osteoclast bone resorption. (Duong and Rodan (1998) 
"Integrin-mediated signaling in the regulation of osteoclast adhesion and activation." Front.
Biosci. 3:757-768.) 

Proliferative Diseases 

[0158] In another example, modulation of PYK2 has been indicated for treatment of 
proliferative diseases such as cancer, e.g., for cancers of hematopoietic cells, among 
others. (Avraham et al., PCT Publication 98/07870, which is incorporated herein by 
reference in its entirety.) 

Inflammation 

[0159] Modulation of PYK2 has also been linked with treatment of inflammatory
response-related diseases, generally those that have an aberrant inflammatory response, for
example, inflammatory bowel diseases such as ulcerative colitis and Crohn's Disease, and
connective tissue diseases such as rheumatoid arthritis, systemic lupus erythematosus,
progressive systemic sclerosis, mixed connective tissue disease, and Sjogren's syndrome.
(Schlessinger et al., PCT Publication WO 00/40971, which is incorporated herein by
reference in its entirety.) A pathologic inflammatory response may be a continuation of an
acute inflammatory response, or a prolonged low-grade inflammatory response, and
typically results in tissue damage. Macrophage and T-cell recruitment, and processes such
as cytokine production, can directly contribute to inflammatory pathogenesis.

II. Crystalline PYK2 Kinase

[0160] Crystalline PYK2 kinases (e.g., human PYK2) include native crystals, kinase 
domain crystals, derivative crystals, and co-crystals. The crystals generally comprise 
substantially pure polypeptides corresponding to the PYK2 kinase polypeptide in crystalline
form. In connection with the development of inhibitors of PYK2 kinase function, it is 
advantageous to use PYK2 kinase domain for structural determination, because use of the 
reduced sequence simplifies structure determination. To be useful for this purpose, the 
kinase domain should be active and/or retain native-type binding, thus indicating that the 
kinase domain takes on substantially normal 3D structure. 

[0161] It is to be understood that the crystalline kinases and kinase domains useful in
the invention are not limited to naturally occurring or native kinase. Indeed, the crystals
include crystals of mutants of native kinases. Mutants of native kinases are obtained by 
replacing at least one amino acid residue in a native kinase with a different amino acid 
residue, or by adding or deleting amino acid residues within the native polypeptide or at 
the N- or C-terminus of the native polypeptide, and have substantially the same three- 
dimensional structure as the native kinase from which the mutant is derived. 

[0162] By "having substantially the same three-dimensional structure" is meant having a
set of atomic structure coordinates that have a root-mean-square deviation of less than or
equal to about 2 Å when superimposed with the atomic structure coordinates of the native
kinase from which the mutant is derived when at least about 50% to 100% of the Cα atoms
of the native kinase or kinase domain are included in the superposition.
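
The criterion above can be expressed as a short calculation. The sketch below is illustrative only: it assumes the two sets of Cα coordinates have already been superimposed and paired residue by residue (the superposition step itself is not shown), and it treats "about 2 Å" and "at least about 50%" as the default thresholds 2.0 and 0.5. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, superimposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def substantially_same_structure(mutant_ca, native_ca, total_native_ca,
                                 max_rmsd=2.0, min_fraction=0.5):
    """Apply the RMSD and Calpha-coverage test to pre-superimposed coordinates.

    mutant_ca, native_ca: paired Calpha coordinates included in the superposition.
    total_native_ca: number of Calpha atoms in the native kinase or kinase domain.
    """
    fraction = len(native_ca) / total_native_ca
    return fraction >= min_fraction and rmsd(mutant_ca, native_ca) <= max_rmsd


# Made-up coordinates: three paired atoms, each displaced by 1 Angstrom.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
mutant = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
print(rmsd(mutant, native))
print(substantially_same_structure(mutant, native, total_native_ca=4))
```

In practice the optimal superposition would be computed first, for example by a least-squares fitting routine in a structural biology package, and the resulting coordinates passed to a check of this kind.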

[0163] Amino acid substitutions, deletions and additions which do not significantly 
interfere with the three-dimensional structure of the kinase will depend, in part, on the 
region of the kinase where the substitution, addition or deletion occurs. In highly variable 
regions of the molecule, non-conservative substitutions as well as conservative 
substitutions may be tolerated without significantly disrupting the three-dimensional
structure of the molecule. In highly conserved regions, or regions containing significant
secondary structure, conservative amino acid substitutions are preferred. Such conserved 
and variable regions can be identified by sequence alignment of PYK2 with other kinases. 
Such alignment of PYK2 kinase domain along with a number of other kinase domains is 
provided in Table 3. 




[0164] Conservative amino acid substitutions are well known in the art, and include 
substitutions made on the basis of similarity in polarity, charge, solubility, hydrophobicity, 
hydrophilicity and/or the amphipathic nature of the amino acid residues involved. For 
example, negatively charged amino acids include aspartic acid and glutamic acid; 
positively charged amino acids include lysine and arginine; amino acids with uncharged 
polar head groups having similar hydrophilicity values include the following: leucine, 
isoleucine, valine; glycine, alanine; asparagine, glutamine; serine, threonine; 
phenylalanine, tyrosine. Other conservative amino acid substitutions are well known in 
the art. 
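
The groupings listed in the preceding paragraph can be held in a simple lookup. The following is a minimal sketch, assuming that a substitution counts as conservative when both residues fall within the same listed group; the groups are copied from the text, other groupings known in the art are not included, and the names CONSERVATIVE_GROUPS and is_conservative are hypothetical.

```python
# Groups copied from the paragraph above, written as three-letter codes.
CONSERVATIVE_GROUPS = [
    {"Asp", "Glu"},         # negatively charged
    {"Lys", "Arg"},         # positively charged
    {"Leu", "Ile", "Val"},
    {"Gly", "Ala"},
    {"Asn", "Gln"},
    {"Ser", "Thr"},
    {"Phe", "Tyr"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share one of the listed groups."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("Asp", "Glu"))
print(is_conservative("Asp", "Lys"))
```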

[0165] For kinases obtained in whole or in part by chemical synthesis, the selection of 
amino acids available for substitution or addition is not limited to the genetically encoded 
amino acids. Indeed, the mutants described herein may contain non-genetically encoded 
amino acids. Conservative amino acid substitutions for many of the commonly known 
non-genetically encoded amino acids are well known in the art. Conservative substitutions 
for other amino acids can be determined based on their physical properties as compared to 
the properties of the genetically encoded amino acids. 

[0166] In some instances, it may be particularly advantageous or convenient to 
substitute, delete and/or add amino acid residues to a native kinase in order to provide 
convenient cloning sites in cDNA encoding the polypeptide, to aid in purification of the 
polypeptide, and for crystallization of the polypeptide. Such substitutions, deletions 
and/or additions which do not substantially alter the three dimensional structure of the 
native kinase domain will be apparent to those of ordinary skill in the art. 

[0167] It should be noted that the mutants contemplated herein need not all exhibit 
kinase activity. Indeed, amino acid substitutions, additions or deletions that interfere with 
the kinase activity but which do not significantly alter the three-dimensional structure of 
the domain are specifically contemplated by the invention. Such crystalline polypeptides, 
or the atomic structure coordinates obtained therefrom, can be used to identify compounds 
that bind to the native domain. These compounds can affect the activity of the native 
domain. 

[0168] The derivative crystals of the invention can comprise a crystalline kinase 
polypeptide in covalent association with one or more heavy metal atoms. The polypeptide
may correspond to a native or a mutated kinase. Heavy metal atoms useful for providing
derivative crystals include, by way of example and not limitation, gold, mercury, 
selenium, etc. 

[0169] The co-crystals of the invention generally comprise a crystalline kinase domain 
polypeptide in association with one or more compounds. The association may be covalent 
or non-covalent. Such compounds include, but are not limited to, cofactors, substrates, 
substrate analogues, inhibitors, allosteric effectors, etc. 

[0170] Exemplary mutations for PYK2 family kinases include the insertion of a 
sequence having the FAK sequence shown in the Table 3 alignment between PYK2
residues 482 and 483. Such insertion is useful, for example, to assist in using PYK2 
kinases to model FAK kinase. Mutations at other sites can likewise be carried out, e.g., to 
make a mutated PYK2 kinase more similar to another kinase for structure modeling and/or 
compound fitting purposes, such as a kinase in the kinase domain alignment in Table 3. 

III. Three Dimensional Structure Determination Using X-ray Crystallography 

[0171] X-ray crystallography is a method of solving the three dimensional structures of 
molecules. The structure of a molecule is calculated from X-ray diffraction patterns using 
a crystal as a diffraction grating. Three dimensional structures of protein molecules are
determined from crystals grown from a concentrated aqueous solution of that protein. The process of
X-ray crystallography can include the following steps: 

(a) synthesizing and isolating (or otherwise obtaining) a polypeptide; 

(b) growing a crystal from an aqueous solution comprising the polypeptide with 
or without a modulator; and 

(c) collecting X-ray diffraction patterns from the crystals, determining unit cell 
dimensions and symmetry, determining electron density, fitting the amino 
acid sequence of the polypeptide to the electron density, and refining the 
structure. 

Production of Polypeptides 

[0172] The native and mutated kinase polypeptides described herein may be chemically 
synthesized in whole or part using techniques that are well-known in the art (see, e.g., 
Creighton (1983) Biopolymers 22(1):49-58).

[0173] Alternatively, methods which are well known to those skilled in the art can be
used to construct expression vectors containing the native or mutated kinase polypeptide 
coding sequence and appropriate transcriptional/translational control signals. These 
methods include in vitro recombinant DNA techniques, synthetic techniques and in vivo 
recombination/genetic recombination. See, for example, the techniques described in 
Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York; and Ausubel, F.M. et al.
(1994) Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J.

[0174] A variety of host-expression vector systems may be utilized to express the kinase 
coding sequence. These include but are not limited to microorganisms such as bacteria 
transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA 
expression vectors containing the kinase domain coding sequence; yeast transformed with 
recombinant yeast expression vectors containing the kinase domain coding sequence; 
insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) 
containing the kinase domain coding sequence; plant cell systems infected with 
recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco 
mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti 
plasmid) containing the kinase domain coding sequence; or animal cell systems. The 
expression elements of these systems vary in their strength and specificities. 

[0175] Depending on the host/vector system utilized, any of a number of suitable 
transcription and translation elements, including constitutive and inducible promoters, may 
be used in the expression vector. For example, when cloning in bacterial systems, 
inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid
promoter) and the like may be used; when cloning in insect cell systems, promoters such 
as the baculovirus polyhedrin promoter may be used; when cloning in plant cell systems, 
promoters derived from the genome of plant cells (e.g., heat shock promoters; the 
promoter for the small subunit of RUBISCO; the promoter for the chlorophyll a/b binding 
protein) or from plant viruses (e.g., the 35S RNA promoter of CaMV; the coat protein 
promoter of TMV) may be used; when cloning in mammalian cell systems, promoters 
derived from the genome of mammalian cells (e.g., metallothionein promoter) or from 
mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter) 
may be used; when generating cell lines that contain multiple copies of the kinase domain
DNA, SV40-, BPV- and EBV-based vectors may be used with an appropriate selectable
marker. 

[0176] Exemplary methods of DNA manipulation, vectors, various
types of cells used, methods of incorporating the vectors into the cells, expression 
techniques, protein purification and isolation methods, and protein concentration methods 
are disclosed in detail in PCT publication WO 96/18738. This publication is incorporated 
herein by reference in its entirety, including any drawings. Those skilled in the art will 
appreciate that such descriptions are applicable to the present invention and can be easily 
adapted to it. 

Crystal Growth 

[0177] Crystals are grown from an aqueous solution containing the purified and 
concentrated polypeptide by a variety of techniques. These techniques include batch,
liquid bridge, dialysis, vapor diffusion, and hanging and sitting drop methods. McPherson
(1982) John Wiley, New York; McPherson (1990) Eur. J. Biochem. 189:1-23; Webber
(1991) Adv. Protein Chem. 41:1-36, incorporated by reference herein in their entireties,
including all figures, tables, and drawings. 

[0178] The native crystals of the invention are, in general, grown by adding precipitants 
to the concentrated solution of the polypeptide. The precipitants are added at a 
concentration just below that necessary to precipitate the protein. Water is removed by 
controlled evaporation to produce precipitating conditions, which are maintained until 
crystal growth ceases. 

[0179] For crystals of the invention, exemplary crystallization conditions are described 
in the Examples. Those of ordinary skill in the art will recognize that the exemplary 
crystallization conditions can be varied. Such variations may be used alone or in 
combination. In addition, other crystallization conditions may be found, e.g. , by using 
crystallization screening plates to identify such other conditions. Those alternate 
conditions can then be optimized if needed to provide larger or better quality crystals. 

[0180] Derivative crystals of the invention can be obtained by soaking native crystals in 
mother liquor containing salts of heavy metal atoms. Exemplary conditions for soaking
a native crystal utilize a solution containing about 0.1 mM to about 5 mM
thimerosal, 4-chloromercuribenzoic acid or KAu(CN)2 for about 2 hr to about 72 hr to
provide derivative crystals suitable for use as isomorphous replacements in determining
the X-ray crystal structure.

[0181] Co-crystals of the invention can be obtained by soaking a native crystal in mother 
liquor containing compound that binds the kinase, or can be obtained by co-crystallizing 
the kinase polypeptide in the presence of a binding compound. 

[0182] In many cases, co-crystallization of kinase and binding compound can be 
accomplished using conditions identified for crystallizing the corresponding kinase 
without binding compound. It is advantageous if a plurality of different crystallization 
conditions have been identified for the kinase, and these can be tested to determine which 
condition gives the best co-crystals. It may also be beneficial to optimize the conditions for
co-crystallization. Alternatively, new crystallization conditions can be determined for 
obtaining co-crystals, e.g., by screening for crystallization and then optimizing those 
conditions. Exemplary co-crystallization conditions are provided in the Examples. 

Determining Unit Cell Dimensions and the Three Dimensional Structure of a 
Polypeptide or Polypeptide Complex 

[0183] Once the crystal is grown, it can be placed in a glass capillary tube or other 
mounting device and mounted onto a holding device connected to an X-ray generator and 
an X-ray detection device. Collection of X-ray diffraction patterns is well documented
by those in the art. See, e.g., Ducruix and Geige, (1992), IRL Press, Oxford, England, and 
references cited therein. A beam of X-rays enters the crystal and then diffracts from the 
crystal. An X-ray detection device can be utilized to record the diffraction patterns 
emanating from the crystal. Although the X-ray detection device on older models of these 
instruments is a piece of film, modern instruments digitally record X-ray diffraction 
scattering. X-ray sources can be of various types, but advantageously, a high intensity 
source is used, e.g., a synchrotron beam source. 

[0184] Methods for obtaining the three dimensional structure of the crystalline form of a 
peptide molecule or molecule complex are well known in the art. See, e.g., Ducruix and 
Geige, (1992), IRL Press, Oxford, England, and references cited therein. The following
are steps in the process of determining the three dimensional structure of a molecule or
complex from X-ray diffraction data. 

[0185] After the X-ray diffraction patterns are collected from the crystal, the unit cell 
dimensions and orientation in the crystal can be determined. They can be determined from 
the spacing between the diffraction emissions as well as the patterns made from these 
emissions. The unit cell dimensions are characterized in three dimensions in units of
Angstroms (one Å = 10⁻¹⁰ meters) and by the angles at each vertex. The symmetry of the unit
cell in the crystals is also characterized at this stage. The symmetry of the unit cell in the 
crystal simplifies the complexity of the collected data by identifying repeating patterns. 
Application of the symmetry and dimensions of the unit cell is described below. 
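
By way of non-limiting illustration, the relationship between the unit cell edge lengths and angles can be sketched in a few lines of Python. The sketch below computes the volume of a general (triclinic) unit cell from three edge lengths in Angstroms and three angles in degrees. The function name and the numerical values are hypothetical and are not taken from the Examples or from Table 1.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Return the volume (cubic Angstroms) of a unit cell with edge lengths
    a, b, c (Angstroms) and angles alpha, beta, gamma (degrees)."""
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    # General triclinic formula; it reduces to a*b*c when all angles are 90.
    term = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if term <= 0.0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(term)


# Hypothetical orthorhombic cell (all angles 90 degrees).
volume = unit_cell_volume(40.0, 50.0, 60.0, 90.0, 90.0, 90.0)
```

For a cell in which all three angles are 90 degrees the result is simply the product of the three edge lengths; for other crystal systems the angle term reduces the volume accordingly.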

[0186] Each diffraction pattern emission is characterized as a vector and the data 
collected at this stage of the method determines the amplitude of each vector. The phases 
of the vectors can be determined using multiple techniques. In one method, heavy atoms 
can be soaked into a crystal, a method called isomorphous replacement, and the phases of 
the vectors can be determined by using these heavy atoms as reference points in the X-ray 
analysis. (Otwinowski, (1991), Daresbury, United Kingdom, 80-86). The isomorphous 
replacement method usually utilizes more than one heavy atom derivative. 

[0187] In another method, the amplitudes and phases of vectors from a crystalline 
polypeptide with an already determined structure can be applied to the amplitudes of the 
vectors from a crystalline polypeptide of unknown structure and consequently determine 
the phases of these vectors. This second method is known as molecular replacement and 
the protein structure which is used as a reference should have a closely related structure to 
the protein of interest. (Naraza (1994) Proteins 11:281-296). Thus, the vector
information from a kinase of known structure, such as those reported herein, is useful for
the molecular replacement analysis of another kinase with unknown structure. 

[0188] Once the phases of the vectors describing the unit cell of a crystal are determined, 
the vector amplitudes and phases, unit cell dimensions, and unit cell symmetry can be used 
as terms in a Fourier transform function. The Fourier transform function calculates the 
electron density in the unit cell from these measurements. The electron density that 
describes one of the molecules or one of the molecule complexes in the unit cell can be 
referred to as an electron density map. The amino acid structures of the sequence or the
molecular structures of compounds complexed with the crystalline polypeptide may then
be fitted to the electron density using a variety of computer programs. This step of the 
process is sometimes referred to as model building and can be accomplished by using 
computer programs such as Turbo/FRODO or "O". (Jones (1985) Methods in Enzymology 
115:157-171). 
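
The role of the Fourier transform can be illustrated with a deliberately simplified, one-dimensional sketch. In the Python code below, a hypothetical "unit cell" of unit length contains two point atoms; structure factors (amplitude and phase) are calculated from the known positions, and the electron density is then recovered by Fourier synthesis. In a real experiment only the amplitudes are measured and the phases must be obtained as described above; the atom positions, electron counts, and function names here are assumptions made solely for illustration.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """Return {h: F(h)} for a 1-D cell; atoms is a list of
    (fractional_position, electron_count) pairs."""
    return {
        h: sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)
        for h in range(-h_max, h_max + 1)
    }


def electron_density(factors, x, cell_length=1.0):
    """Fourier synthesis: rho(x) = (1/L) * sum_h F(h) * exp(-2*pi*i*h*x)."""
    total = sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items())
    return total.real / cell_length


# Hypothetical cell: a heavier atom at x = 0.25 and a lighter one at x = 0.70.
atoms = [(0.25, 8.0), (0.70, 6.0)]
factors = structure_factors(atoms, h_max=20)
grid = [i / 100 for i in range(100)]
density = [electron_density(factors, x) for x in grid]
peak_x = grid[density.index(max(density))]
```

The highest peak of the synthesized density falls at the position of the heavier atom, which is the one-dimensional analogue of locating atoms in a three-dimensional electron density map.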

[0189] A theoretical electron density map can then be calculated from the amino acid 
structures fit to the experimentally determined electron density. The theoretical and 
experimental electron density maps can be compared to one another and the agreement 
between these two maps can be described by a parameter called an R-factor. A low value 
for an R-factor describes a high degree of overlapping electron density between a 
theoretical and experimental electron density map. 
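
The text above does not give a formula for the R-factor. As a hedged illustration, the Python sketch below uses the conventional crystallographic residual, which is computed on observed and calculated structure factor amplitudes rather than directly on the maps: R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. The function name and the amplitude values are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual between observed and calculated amplitudes,
    summed over matching reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for four reflections.
observed = [100.0, 80.0, 60.0, 40.0]
calculated = [90.0, 85.0, 66.0, 39.0]
r = r_factor(observed, calculated)
```

A value of zero would indicate perfect agreement; lower values correspond to the better agreement between model and data described above.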

[0190] The R-factor is then minimized by using computer programs that refine the 
theoretical electron density map. A computer program such as X-PLOR can be used for 
model refinement by those skilled in the art. (Brünger (1992) Nature 355:472-475.)
Refinement may be achieved in an iterative process. A first step can entail altering the 
conformation of atoms defined in an electron density map. The conformations of the 
atoms can be altered by simulating a rise in temperature, which will increase the 
vibrational frequency of the bonds and modify positions of atoms in the structure. At a 
particular point in the atomic perturbation process, a force field, which typically defines 
interactions between atoms in terms of allowed bond angles and bond lengths, Van der 
Waals interactions, hydrogen bonds, ionic interactions, and hydrophobic interactions, can 
be applied to the system of atoms. Favorable interactions may be described in terms of 
free energy and the atoms can be moved over many iterations until a free energy minimum 
is achieved. The refinement process can be iterated until the R-factor reaches a minimum 
value. 

[0191] The three dimensional structure of the molecule or molecule complex is 
described by atoms that fit the theoretical electron density characterized by a minimum
R-value. A file can then be created for the three dimensional structure that defines each atom
by coordinates in three dimensions. An example of such a structural coordinate file is
shown in Table 1.

IV. Structures of PYK2

[0192] The present invention provides high-resolution three-dimensional structures and
atomic structure coordinates of crystalline PYK2 kinase domain and PYK2 kinase domain 
co-complexed with exemplary binding compounds as determined by X-ray 
crystallography. The methods used to obtain the structure coordinates are provided in the 
examples. The atomic structure coordinates of crystalline PYK2 are listed in Table 1, and 
atomic coordinates for PYK2 co-crystallized with AMPPNP are provided in Table 2. Co- 
crystal coordinates can be used in the same way, e.g., in the various aspects described 
herein, as coordinates for the protein by itself. 

[0193] Those having skill in the art will recognize that atomic structure coordinates as 
determined by X-ray crystallography are not without error. Thus, it is to be understood 
that any set of structure coordinates obtained for crystals of PYK2, whether native 
crystals, kinase domain crystals, derivative crystals or co-crystals, that have a root mean 
square deviation ("r.m.s.d.") of less than or equal to about 1.5 Å when superimposed,
using backbone atoms (N, Cα, C and O), on the structure coordinates listed in Table 1 (or
Table 2) are considered to be identical with the structure coordinates listed in Table 1
(or Table 2) when at least about 50% to 100% of the backbone atoms of PYK2 or PYK2
kinase domain are included in the superposition. 
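
The r.m.s.d. comparison described above can be sketched as follows. This minimal Python example assumes that the two sets of backbone atom coordinates have already been optimally superimposed and that atoms are listed in the same order; it does not perform the superposition itself. The function names, the default cutoff argument, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (Angstroms) between two equal-length lists
    of (x, y, z) tuples that are assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def within_tolerance(coords_a, coords_b, cutoff=1.5):
    """True when the r.m.s.d. is less than or equal to the cutoff (Angstroms)."""
    return rmsd(coords_a, coords_b) <= cutoff


# Hypothetical backbone atom positions for a reference and a candidate structure.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 0.0)]
candidate = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 0.5)]
shifted = [(x + 3.0, y, z) for x, y, z in reference]
```

Here the candidate deviates from the reference by about 0.35 Å and would fall within the stated tolerance, whereas the copy displaced by 3 Å would not.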

V. Uses of the Crystals and Atomic Structure Coordinates 

[0194] The crystals of the invention, and particularly the atomic structure coordinates 
obtained therefrom, have a wide variety of uses. For example, the crystals described 
herein can be used as a starting point in any of the methods of use for kinases known in the 
art or later developed. Such methods of use include, for example, identifying molecules 
that bind to the native or mutated catalytic domain of kinases. The crystals and structure 
coordinates are particularly useful for identifying ligands that modulate kinase activity as 
an approach towards developing new therapeutic agents. In particular, the crystals and 
structural information are useful in methods for ligand development utilizing molecular 
scaffolds. 

[0195] The structure coordinates described herein can be used as phasing models or 
homology models for determining the crystal structures of additional kinases, as well as 
the structures of co-crystals of such kinases with ligands such as inhibitors, agonists, 
antagonists, and other molecules. The structure coordinates, as well as models of the
three-dimensional structures obtained therefrom, can also be used to aid the elucidation of
solution-based structures of native or mutated kinases, such as those obtained via NMR. 

VI. Electronic Representations of Kinase Structures 

[0196] Structural information of kinases or portions of kinases (e.g., kinase active sites) 
can be represented in many different ways. Particularly useful are electronic 
representations, as such representations allow rapid and convenient data manipulations and 
structural modifications. Electronic representations can be embedded in many different
storage or memory media, frequently computer readable media. Examples include, without
limitation, computer random access memory (RAM), floppy disk, magnetic hard drive,
magnetic tape (analog or digital), compact disk (CD), optical disk, CD-ROM, memory
card, digital video disk (DVD), and others. The storage medium can be separate or part of
a computer system. Such a computer system may be a dedicated, special purpose, or
embedded system, such as a computer system that forms part of an X-ray crystallography
system, or may be a general purpose computer (which may have a data connection with
other equipment such as a sensor device in an X-ray crystallographic system). In many
cases, the information provided by such electronic representations can also be represented 
physically or visually in two or three dimensions, e.g., on paper, as a visual display (e.g., 
on a computer monitor as a two-dimensional or pseudo-three-dimensional image) or as a 
three-dimensional physical model. Such physical representations can also be used, alone 
or in connection with electronic representations. Exemplary useful representations 
include, but are not limited to, the following: 

Atomic Coordinate Representation 

[0197] One type of representation is a list or table of atomic coordinates representing 
positions of particular atoms in a molecular structure, portions of a structure, or complex 
(e.g., a co-crystal). Such a representation may also include additional information, for 
example, information about occupancy of particular coordinates. One such atomic 
coordinate representation contains the coordinate information of Table 1 in electronic 
form. 
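
One possible electronic form of an atomic coordinate representation is sketched below in Python. The sketch assumes a fixed-width, PDB-style ATOM record layout; this is an assumption for illustration only and is not a statement about the format of Table 1. The class name, function names, and the example atom (including its residue number and coordinates) are hypothetical, and the sketch does not reproduce every alignment rule of the PDB format.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def format_atom(atom):
    """Write one atom as a fixed-width, PDB-style ATOM record (66 columns)."""
    return (
        f"ATOM  {atom.serial:>5} {atom.name:<4} {atom.res_name:>3} "
        f"{atom.chain:1}{atom.res_seq:>4}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
        f"{atom.occupancy:6.2f}{atom.b_factor:6.2f}"
    )


def parse_atom(line):
    """Read the fields back from their fixed column positions."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


# Hypothetical atom; the values are illustrative only.
atom = Atom(1, "CA", "LEU", "A", 431, 12.345, -6.789, 101.5, 1.0, 25.3)
record = format_atom(atom)
round_trip = parse_atom(record)
```

The occupancy field corresponds to the additional information about occupancy of particular coordinates mentioned above.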

Energy Surface or Surface of Interaction Representation

[0198] Another representation is an energy surface representation, e.g., of an active site
or other binding site, representing an energy surface for electronic and steric interactions. 
Such a representation may also include other features. An example is the inclusion of 
representation of a particular amino acid residue(s) or group(s) on a particular amino acid 
residue(s), e.g., a residue or group that can participate in H-bonding or ionic interaction. 
Such energy surface representations can be readily generated from atomic coordinate 
representations using any of a variety of available computer programs. 

Structural Representation 

[0199] Still another representation is a structural representation, i.e., a physical 
representation or an electronic representation of such a physical representation. Such a 
structural representation includes representations of relative positions of particular features 
of a molecule or complex, often with linkage between structural features. For example, a 
structure can be represented in which all atoms are linked; atoms other than hydrogen are 
linked; backbone atoms, with or without representation of sidechain atoms that could 
participate in significant electronic interaction, are linked; among others. However, not all 
features need to be linked. For example, for structural representations of portions of a
molecule or complex, structural features significant for that portion may be represented
(e.g., atoms of amino acid residues that can have significant binding interaction with a
ligand at a binding site). Those amino acid residues may not be linked with each other.

[0200] A structural representation can also be a schematic representation. For example, 
a schematic representation can represent secondary and/or tertiary structure in a schematic 
manner. Within such a schematic representation of a polypeptide, a particular amino acid 
residue(s) or group(s) on a residue(s) can be included, e.g., conserved residues in a binding 
site, and/or residue(s) or group(s) that may interact with binding compounds. Electronic 
structural representations can be generated, for example, from atomic coordinate 
information using computer programs designed for that function and/or by constructing an 
electronic representation with manual input based on interpretation of another form of 
structural information. Physical representations can be created, for example, by printing
a computer-generated image or by constructing a 3D model.

VII. Structure Determination for Kinases with Unknown Structure Using 
Structural Coordinates

[0201] Structural coordinates, such as those set forth in Table 1, can be used to
determine the three dimensional structures of kinases with unknown structure. The 
methods described below can apply structural coordinates of a polypeptide with known 
structure to another data set, such as an amino acid sequence, X-ray crystallographic 
diffraction data, or nuclear magnetic resonance (NMR) data. Preferred embodiments of the 
invention relate to determining the three dimensional structures of other serine/threonine 
kinases, and related polypeptides. 

Structures Using Amino Acid Homology 

[0202] Homology modeling is a method of applying structural coordinates of a 
polypeptide of known structure to the amino acid sequence of a polypeptide of unknown 
structure. This method is accomplished using a computer representation of the three 
dimensional structure of a polypeptide or polypeptide complex, the computer 
representation of amino acid sequences of the polypeptides with known and unknown 
structures, and standard computer representations of the structures of amino acids. 
Homology modeling generally involves (a) aligning the amino acid sequences of the 
polypeptides with and without known structure; (b) transferring the coordinates of the 
conserved amino acids in the known structure to the corresponding amino acids of the 
polypeptide of unknown structure; (c) refining the subsequent three dimensional structure;
and (d) constructing structures of the rest of the polypeptide. One skilled in the art 
recognizes that conserved amino acids between two proteins can be determined from the 
sequence alignment step in step (a). 

[0203] The above method is well known to those skilled in the art. (Greer (1985) Science
228:1055; Blundell et al. (1988) Eur. J. Biochem. 172:513.) An exemplary computer
program that can be utilized for homology modeling by those skilled in the art is the
Homology module in the Insight II modeling package distributed by Accelrys Inc.

[0204] Alignment of the amino acid sequence is accomplished by first placing the 
computer representation of the amino acid sequence of a polypeptide with known structure 
above the amino acid sequence of the polypeptide of unknown structure. Amino acids in 
the sequences are then compared and groups of amino acids that are homologous (e.g., 
amino acid side chains that are similar in chemical nature - aliphatic, aromatic, polar, or 
charged) are grouped together. This method will detect conserved regions of the
polypeptides and account for amino acid insertions or deletions. Such alignment
can also be performed fully electronically using sequence alignment and analysis
software.

[0205] Once the amino acid sequences of the polypeptides with known and unknown 
structures are aligned, the structures of the conserved amino acids in the computer 
representation of the polypeptide with known structure are transferred to the 
corresponding amino acids of the polypeptide whose structure is unknown. For example, 
a tyrosine in the amino acid sequence of known structure may be replaced by a 
phenylalanine, the corresponding homologous amino acid in the amino acid sequence of 
unknown structure. 
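
The coordinate transfer step can be sketched in Python as follows. The sketch takes two pre-aligned sequences (with "-" marking a gap), treats aligned residues as conserved when they are identical or fall in the same chemical class, and copies the template coordinates to those positions. The class membership, the function names, the sequences, and the single coordinate per residue are all assumptions made for illustration; a real homology modeling package would transfer full backbone and side chain coordinates.

```python
# Chemical classes loosely following the grouping named in the text
# (aliphatic, aromatic, polar, charged); exact membership is an assumption.
CLASSES = {
    "aliphatic": set("AVLIMGP"),
    "aromatic": set("FWY"),
    "polar": set("STNQCH"),
    "charged": set("DEKR"),
}


def residue_class(aa):
    """Return the chemical class name for a one-letter code, or None."""
    for name, members in CLASSES.items():
        if aa in members:
            return name
    return None


def transfer_coordinates(template_aln, target_aln, template_coords):
    """Copy template coordinates to target residues at aligned positions that
    are identical or in the same chemical class.
    Returns {target_residue_index: (x, y, z)}."""
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have equal length")
    transferred = {}
    t_idx = 0  # index into the ungapped template sequence
    q_idx = 0  # index into the ungapped target sequence
    for t_aa, q_aa in zip(template_aln, target_aln):
        if t_aa != "-" and q_aa != "-":
            t_cls = residue_class(t_aa)
            if t_aa == q_aa or (t_cls is not None and t_cls == residue_class(q_aa)):
                transferred[q_idx] = template_coords[t_idx]
        if t_aa != "-":
            t_idx += 1
        if q_aa != "-":
            q_idx += 1
    return transferred


# Hypothetical alignment: Y in the template is matched by F in the target,
# and the target has one inserted residue (G) with no template counterpart.
template_aln = "MKY-LD"
target_aln = "MRFGL-"
template_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
                   (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
model = transfer_coordinates(template_aln, target_aln, template_coords)
```

In this example the inserted target residue receives no coordinates, which corresponds to a non-conserved region whose structure must be assigned separately.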

[0206] The structures of amino acids located in non-conserved regions are to be assigned 
manually by either using standard peptide geometries or molecular simulation techniques, 
such as molecular dynamics. The final step in the process is accomplished by refining the 
entire structure using molecular dynamics and/or energy minimization. The homology 
modeling method is well known to those skilled in the art and has been practiced using 
different protein molecules. For example, the three dimensional structure of the 
polypeptide corresponding to the catalytic domain of a serine/threonine protein kinase, 
myosin light chain protein kinase, was homology modeled from the cAMP-dependent 
protein kinase catalytic subunit. (Knighton et al (1992) Science 258:130-135.) 

Structures Using Molecular Replacement 

[0207] Molecular replacement is a method of applying the X-ray diffraction data of a
polypeptide of known structure to the X-ray diffraction data of a polypeptide of unknown
structure. This method can be utilized to define the phases describing the X-ray diffraction
data of a polypeptide of unknown structure when only the amplitudes are known. X-PLOR
is a commonly utilized computer software package used for molecular replacement.
Brünger (1992) Nature 355:472-475. AMORE is another program used for molecular
replacement. Navaza (1994) Acta Crystallogr. A50:157-163. Preferably, the resulting
structure does not exhibit a root-mean-square deviation of more than 3 Å.

[0208] A goal of molecular replacement is to align the positions of atoms in the unit cell
by matching electron diffraction data from two crystals. A program such as X-PLOR can
involve four steps. A first step can be to determine the number of molecules in the unit cell
and define the angles between them. A second step can involve rotating the diffraction
data to define the orientation of the molecules in the unit cell. A third step can be to 
translate the electron density in three dimensions to correctly position the molecules in the 
unit cell. Once the amplitudes and phases of the X-ray diffraction data are determined, an
R-factor can be calculated by comparing electron diffraction maps calculated
experimentally from the reference data set and calculated from the new data set. An
R-factor between 30% and 50% indicates that the orientations of the atoms in the unit cell are
reasonably determined by this method. A fourth step in the process can be to decrease the
R-factor to roughly 20% by refining the new electron density map using iterative
refinement techniques described herein and known to those of ordinary skill in the art.

Structures Using NMR Data 

[0209] Structural coordinates of a polypeptide or polypeptide complex derived from X- 
ray crystallographic techniques can be applied towards the elucidation of three 
dimensional structures of polypeptides from nuclear magnetic resonance (NMR) data. This 
method is used by those skilled in the art. (Wüthrich, (1986), John Wiley and Sons, New
York:176-199; Pflugrath et al. (1986) J. Mol. Biol. 189:383-386; Kline et al. (1986) J.
Mol. Biol. 189:377-382.) While the secondary structure of a polypeptide is often readily
determined by utilizing two-dimensional NMR data, the spatial connections between 
individual pieces of secondary structure are not as readily determinable. The coordinates 
defining a three-dimensional structure of a polypeptide derived from X-ray 
crystallographic techniques can guide the NMR spectroscopist to an understanding of 
these spatial interactions between secondary structural elements in a polypeptide of related 
structure. 

[0210] The knowledge of spatial interactions between secondary structural elements can 
greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR 
experiments. Additionally, applying the crystallographic coordinates after the 
determination of secondary structure by NMR techniques only simplifies the assignment 
of NOEs relating to particular amino acids in the polypeptide sequence and does not 
greatly bias the NMR analysis of polypeptide structure. Conversely, using the 
crystallographic coordinates to simplify NOE data while determining secondary structure 
of the polypeptide would bias the NMR analysis of protein structure.

VIII. Structure-Based Design of Modulators of Kinase Function Utilizing
Structural Coordinates 

[0211] Structure-based modulator design and identification methods are powerful 
techniques that can involve searches of computer databases containing a wide variety of 
potential modulators and chemical functional groups. The computerized design and 
identification of modulators is useful as the computer databases contain more compounds 
than the chemical libraries, often by an order of magnitude. For reviews of structure-based
drug design and identification, see Kuntz et al. (1994), Acc. Chem. Res. 27:117; Guida
(1994) Current Opinion in Struc. Biol. 4:777; Colman (1994) Current Opinion in Struc.
Biol. 4:868.

[0212] The three dimensional structure of a polypeptide defined by structural 
coordinates can be utilized by these design methods, for example, the structural 
coordinates of Table 1. In addition, the three dimensional structures of kinases determined 
by the homology, molecular replacement, and NMR techniques described herein can also 
be applied to modulator design and identification methods. 

[0213] For identifying modulators, structural information for a native kinase, in 
particular, structural information for the active site of the kinase, can be used. However, it 
may be advantageous to utilize structural information from one or more co-crystals of the 
kinase with one or more binding compounds. It can also be advantageous if the binding 
compound has a structural core in common with test compounds. 

Design by Searching Molecular Data Bases 

[0214] One method of rational design searches for modulators by docking the computer 
representations of compounds from a database of molecules. Publicly available databases 
include, for example: 

a) ACD from Molecular Designs Limited 

b) NCI from National Cancer Institute 

c) CCDC from Cambridge Crystallographic Data Center 

d) CAST from Chemical Abstract Service 

e) Derwent from Derwent Information Limited 

f) Maybridge from Maybridge Chemical Company LTD 

g) Aldrich from Aldrich Chemical Company

h) Directory of Natural Products from Chapman & Hall

[0215] One such data base (ACD distributed by Molecular Designs Limited Information 
Systems) contains compounds that are synthetically derived or are natural products. 
Methods available to those skilled in the art can convert a data set represented in two 
dimensions to one represented in three dimensions. These methods are enabled by such 
computer programs as CONCORD from Tripos Associates or DE-Converter from 
Molecular Simulations Limited. 

[0216] Multiple methods of structure-based modulator design are known to those in the 
art. (Kuntz et al., (1982), J. Mol. Biol. 162:269; Kuntz et al., (1994), Acc. Chem. Res.
27:117; Meng et al., (1992), J. Comp. Chem. 13:505; Bohm, (1994), J. Comp. Aided
Molec. Design 8:623.)

[0217] A computer program widely utilized by those skilled in the art of rational 
modulator design is DOCK from the University of California in San Francisco. The 
general methods utilized by this computer program and programs like it are described in 
three applications below. More detailed information regarding some of these techniques 
can be found in the Accelrys User Guide, 1995. A typical computer program used for
this purpose can perform a process comprising the following steps or functions:

(a) remove the existing compound from the protein; 

(b) dock the structure of another compound into the active-site using the computer
program (such as DOCK) or by interactively moving the compound into the
active-site;

(c) characterize the space between the compound and the active-site atoms;

(d) search libraries for molecular fragments which (i) can fit into the empty space
between the compound and the active-site, and (ii) can be linked to the
compound; and

(e) link the fragments found above to the compound and evaluate the new modified
compound.

[0218] Part (c) refers to characterizing the geometry and the complementary interactions 
formed between the atoms of the active site and the compounds. A favorable geometric fit 
is attained when a significant surface area is shared between the compound and active-site 
atoms without forming unfavorable steric interactions. One skilled in the art would note
that the method can be performed by skipping parts (d) and (e) and screening a database of
many compounds. 
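
A very simple way to quantify the geometric fit described for part (c) is sketched below in Python. The sketch counts atom pairs that are close enough to be in contact and, separately, pairs that are so close that they would clash sterically. It is not the scoring function of DOCK or of any other named program; the function name, the two distance thresholds, and the coordinates are hypothetical, and complementary interactions such as hydrogen bonds are not modeled.

```python
import math


def fit_score(compound_atoms, site_atoms, clash_dist=2.5, contact_dist=4.5):
    """Return (contacts, clashes) for two lists of (x, y, z) coordinates.
    Pairs closer than clash_dist count as clashes; pairs between clash_dist
    and contact_dist (inclusive) count as favorable contacts."""
    contacts = 0
    clashes = 0
    for c in compound_atoms:
        for s in site_atoms:
            d = math.dist(c, s)
            if d < clash_dist:
                clashes += 1
            elif d <= contact_dist:
                contacts += 1
    return contacts, clashes


# Hypothetical active-site atoms and two candidate placements of one atom.
site = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
pose_good = [(2.0, 3.0, 0.0)]  # near both site atoms, no clash
pose_bad = [(1.0, 0.0, 0.0)]   # overlaps the first site atom
good = fit_score(pose_good, site)
bad = fit_score(pose_bad, site)
```

Under these assumptions a placement with many contacts and no clashes would be preferred, consistent with the favorable geometric fit described above.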

[0219] Structure-based design and identification of modulators of kinase function can be 
used in conjunction with assay screening. As large computer databases of compounds 
(around 10,000 compounds) can be searched in a matter of hours or even less, the 
computer-based method can narrow the compounds tested as potential modulators of 
kinase function in biochemical or cellular assays. 

[0220] The above descriptions of structure-based modulator design are not all 
encompassing and other methods are reported in the literature and can be used, e.g. : 

(1) CAVEAT: Bartlett et al., (1989), in Chemical and Biological Problems in
Molecular Recognition, Roberts, S.M.; Ley, S.V.; Campbell, M.M. eds.; Royal
Society of Chemistry: Cambridge, pp. 182-196.

(2) FLOG: Miller et al., (1994), J. Comp. Aided Molec. Design 8:153.

(3) PRO Modulator: Clark et al., (1995), J. Comp. Aided Molec. Design 9:13.

(4) MCSS: Miranker and Karplus, (1991), Proteins: Structure, Function, and
Genetics 11:29.

(5) AUTODOCK: Goodsell and Olson, (1990), Proteins: Structure, Function, and
Genetics 8:195.

(6) GRID: Goodford, (1985), J. Med. Chem. 28:849.

Design by Modifying Compounds in Complex with PYK2 Kinase 

[0221] Another way of identifying compounds as potential modulators is to modify an 
existing modulator in the polypeptide active site. For example, the computer 
representation of modulators can be modified within the computer representation of a 
PYK2 active site. Detailed instructions for this technique can be found, for example, in the 
Accelrys User Manual, 1995 in LUDI. The computer representation of the modulator is
typically modified by the deletion of a chemical group or groups or by the addition of a 
chemical group or groups. 

[0222] Upon each modification to the compound, the atoms of the modified compound
and active site can be shifted in conformation and the distance between the modulator and
the active-site atoms may be scored along with any complementary interactions formed
between the two molecules. Scoring can be complete when a favorable geometric fit and
favorable complementary interactions are attained. Compounds that have favorable scores
are potential modulators.

Design by Modifying the Structure of Compounds that Bind PYK2 Kinase 

[0223] A third method of structure-based modulator design is to screen compounds 
designed by a modulator building or modulator searching computer program. Examples of 
these types of programs can be found in the Molecular Simulations Package, Catalyst. 
Descriptions for using this program are documented in the Molecular Simulations User 
Guide (1995). Other computer programs used in this application are ISIS/HOST,
ISIS/BASE, and ISIS/DRAW from Molecular Designs Limited and UNITY from Tripos
Associates.

[0224] These programs can be operated on the structure of a compound that has been 
removed from the active site of the three dimensional structure of a compound-kinase 
complex. Operating the program on such a compound is preferable since it is in a 
biologically active conformation. 

[0225] A modulator construction computer program is a computer program that may be 
used to replace computer representations of chemical groups in a compound complexed 
with a kinase or other biomolecule with groups from a computer database. A modulator 
searching computer program is a computer program that may be used to search computer 
representations of compounds from a computer data base that have similar three 
dimensional structures and similar chemical groups as a compound bound to a particular
biomolecule. 

[0226] A typical program can operate by using the following general steps: 

(a) map the compounds by chemical features such as by hydrogen bond donors or
acceptors, hydrophobic/lipophilic sites, positively ionizable sites, or negatively
ionizable sites;

(b) add geometric constraints to the mapped features; and

(c) search databases with the model generated in (b).

[0227] Those skilled in the art also recognize that not all of the possible chemical 
features of the compound need be present in the model of (b). One can use any subset of 
the model to generate different models for data base searches. 
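
The general steps (a) through (c), and the use of a feature subset, can be illustrated with a minimal Python sketch. It is not the algorithm of any named program. The `Feature` class, the 1.0 angstrom tolerance, the coordinates, and the compound names are all hypothetical, and the matching is deliberately loose: each distance constraint is checked independently rather than requiring one consistent assignment of features.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass(frozen=True)
class Feature:
    kind: str    # e.g. "donor", "acceptor", "hydrophobic" (hypothetical labels)
    xyz: tuple   # 3D coordinates in angstroms


def build_model(features, tolerance=1.0):
    """Steps (a)-(b): turn mapped features into pairwise distance constraints."""
    model = []
    for f1, f2 in combinations(features, 2):
        d = dist(f1.xyz, f2.xyz)
        model.append((f1.kind, f2.kind, d - tolerance, d + tolerance))
    return model


def matches(model, candidate):
    """True if each constraint is met by some feature pair in the candidate."""
    for kind_a, kind_b, lo, hi in model:
        ok = any(
            lo <= dist(a.xyz, b.xyz) <= hi
            for a in candidate if a.kind == kind_a
            for b in candidate if b.kind == kind_b and b is not a
        )
        if not ok:
            return False
    return True


def search(model, database):
    """Step (c): names of database entries that satisfy the model."""
    return [name for name, feats in database.items() if matches(model, feats)]


# Features of a compound taken from a compound-kinase complex (made-up values).
bound = [
    Feature("donor", (0.0, 0.0, 0.0)),
    Feature("acceptor", (3.0, 0.0, 0.0)),
    Feature("hydrophobic", (0.0, 4.0, 0.0)),
]
database = {
    "cmpd_1": [
        Feature("donor", (1.0, 1.0, 1.0)),
        Feature("acceptor", (4.0, 1.0, 1.0)),
        Feature("hydrophobic", (1.0, 5.0, 1.0)),
    ],
    "cmpd_2": [
        Feature("donor", (0.0, 0.0, 0.0)),
        Feature("acceptor", (3.5, 0.0, 0.0)),
    ],
}

full_model = build_model(bound)       # all mapped features
sub_model = build_model(bound[:2])    # a subset of the features
print(search(full_model, database))
print(search(sub_model, database))
```

With these made-up inputs the full model retrieves only the first compound, while the donor/acceptor subset model also retrieves the second, which is the practical effect of searching with a subset of the features.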

Modulator Design Using Molecular Scaffolds

[0228] The present invention can also advantageously utilize methods for designing 
compounds, designated as molecular scaffolds, that can act broadly across families of 
molecules and/or for using a molecular scaffold to design ligands that target individual or 
multiple members of those families. In preferred embodiments, the molecules can be 
proteins and a set of chemical compounds can be assembled that have properties such that 
they are 1) chemically designed to act on certain protein families and/or 2) behave more 
like molecular scaffolds, meaning that they have chemical substructures that make them 
specific for binding to one or more proteins in a family of interest. Alternatively, 
molecular scaffolds can be designed that are preferentially active on an individual target 
molecule. 

[0229] Useful chemical properties of molecular scaffolds can include one or more of the 
following characteristics, but are not limited thereto: an average molecular weight below
about 350 daltons, or from about 150 to about 350 daltons, or from about 150 to
about 300 daltons; having a clogP below 3; a number of rotatable bonds of less than 4; a
number of hydrogen bond donors and acceptors below 5 or below 4; a polar surface area
of less than 50 Å²; binding at protein binding sites in an orientation so that chemical
substituents from a combinatorial library that are attached to the scaffold can be projected 
into pockets in the protein binding site; and possessing chemically tractable structures at 
its substituent attachment points that can be modified, thereby enabling rapid library 
construction. 
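
The numerical criteria above lend themselves to a simple computational pre-filter. The following Python sketch assumes the descriptors have already been calculated elsewhere. The `Descriptors` class and its field names are hypothetical, the hydrogen bond criterion is read here as a combined donor-plus-acceptor count (the text leaves this open), and the filter requires all criteria at once even though the text allows "one or more".

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float          # daltons
    clogp: float               # calculated log P (octanol/water)
    rotatable_bonds: int
    hbond_donors: int
    hbond_acceptors: int
    polar_surface_area: float  # square angstroms


def is_scaffold_like(d, max_mw=350.0, max_clogp=3.0, max_rot=4,
                     max_hbond=5, max_psa=50.0):
    """Return True if every numerical scaffold criterion is satisfied."""
    return (
        d.mol_weight < max_mw
        and d.clogp < max_clogp
        and d.rotatable_bonds < max_rot
        # Assumption: donors and acceptors are counted together.
        and (d.hbond_donors + d.hbond_acceptors) < max_hbond
        and d.polar_surface_area < max_psa
    )


small = Descriptors(220.0, 1.8, 2, 1, 2, 45.0)   # made-up values
large = Descriptors(410.0, 4.2, 7, 3, 6, 95.0)   # made-up values
print(is_scaffold_like(small), is_scaffold_like(large))
```

The thresholds are keyword arguments so that the alternative ranges given in the text (for example, a hydrogen bond limit of 4) can be substituted without changing the function.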

[0230] By "clog P" is meant the calculated log P of a compound, "P" referring to the 
partition coefficient between octanol and water. 

[0231] The term "Molecular Polar Surface Area (PSA)" refers to the sum of surface 
contributions of polar atoms (usually oxygens, nitrogens and attached hydrogens) in a 
molecule. The polar surface area has been shown to correlate well with drug transport 
properties, such as intestinal absorption, or blood-brain barrier penetration. 

[0232] Additional useful chemical properties of distinct compounds for inclusion in a
combinatorial library include the ability to attach chemical moieties to the compound that
will not interfere with binding of the compound to at least one protein of interest, and that
will impart desirable properties to the library members, for example, causing the library
members to be actively transported to cells and/or organs of interest, or the ability to
attach to a device such as a chromatography column (e.g., a streptavidin column through a
molecule such as biotin) for uses such as tissue and proteomics profiling purposes.

[0233] A person of ordinary skill in the art will realize other properties that can be 
desirable for the scaffold or library members to have depending on the particular 
requirements of the use, and that compounds with these properties can also be sought and 
identified in like manner. Methods of selecting compounds for assay are known to those 
of ordinary skill in the art, for example, methods and compounds described in U.S. Patent
Nos. 6,288,234, 6,090,912, and 5,840,485, each of which is hereby incorporated by reference in
its entirety, including all charts and drawings. 

[0234] In various embodiments, the present invention provides methods of designing 
ligands that bind to a plurality of members of a molecular family, where the ligands 
contain a common molecular scaffold. Thus, a compound set can be assayed for binding 
to a plurality of members of a molecular family, e.g., a protein family. One or more 
compounds that bind to a plurality of family members can be identified as molecular 
scaffolds. When the orientation of the scaffold at the binding site of the target molecules 
has been determined and chemically tractable structures have been identified, a set of 
ligands can be synthesized starting with one or a few molecular scaffolds to arrive at a 
plurality of ligands, wherein each ligand binds to a separate target molecule of the 
molecular family with altered or changed binding affinity or binding specificity relative to 
the scaffold. Thus, a plurality of drug lead molecules can be designed to preferentially 
target individual members of a molecular family based on the same molecular scaffold, 
and act on them in a specific manner. 

IX. Binding Assays 

[0235] The methods of the present invention can involve assays that are able to detect
the binding of compounds to a target molecule. Such binding is at a statistically
significant level, preferably with a confidence level of at least 90%, more preferably at
least 95%, 97%, 98%, 99%, or greater confidence level that the assay signal represents binding to
the target molecule, i.e., is distinguished from background. Preferably controls are used to
distinguish target binding from non-specific binding. The assays of the present invention
can also include assaying compounds for low affinity binding to the target molecule. A
large variety of assays indicative of binding are known for different target types and can
be used for this invention. Compounds that act broadly across protein families are not
likely to have a high affinity against individual targets, due to the broad nature of their
binding. Thus, assays described herein allow for the identification of compounds that bind
with low affinity, very low affinity, and extremely low affinity. Therefore, potency (or
binding affinity) is not the primary, nor even the most important, indicator for identifying
a potentially useful binding compound. Rather, even those compounds that bind with
low affinity, very low affinity, or extremely low affinity can be considered as molecular
scaffolds that can continue to the next phase of the ligand design process.

[0236] By binding with "low affinity" is meant binding to the target molecule with a
dissociation constant (kd) of greater than 1 µM under standard conditions. By binding
with "very low affinity" is meant binding with a kd of above about 100 µM under standard
conditions. By binding with "extremely low affinity" is meant binding at a kd of above
about 1 mM under standard conditions. By "moderate affinity" is meant binding with a kd
of from about 200 nM to about 1 µM under standard conditions. By "moderately high
affinity" is meant binding at a kd of from about 1 nM to about 200 nM. By binding at
"high affinity" is meant binding at a kd of below about 1 nM under standard conditions.
For example, low affinity binding can occur because of a poorer fit into the binding site of
the target molecule or because of a smaller number of non-covalent bonds, or weaker
covalent bonds present to cause binding of the scaffold or ligand to the binding site of the
target molecule relative to instances where higher affinity binding occurs. The standard
conditions for binding are at pH 7.2 at 37°C for one hour. For example, 100 µl/well can be
used in HEPES 50 mM buffer at pH 7.2, NaCl 15 mM, ATP 2 µM, and bovine serum
albumin 1 µg/well, 37°C for one hour.
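
The affinity classes defined in this paragraph can be expressed as a small lookup, sketched below in Python. The function name is hypothetical. Because the "low", "very low", and "extremely low" ranges overlap as written, this sketch returns the most specific class, and the handling of values falling exactly on a boundary is an assumption, since the text uses "about".

```python
def affinity_class(kd_molar):
    """Map a dissociation constant (in mol/L) to the most specific class."""
    if kd_molar <= 0:
        raise ValueError("kd must be positive")
    if kd_molar < 1e-9:        # below about 1 nM
        return "high"
    if kd_molar <= 200e-9:     # about 1 nM to about 200 nM
        return "moderately high"
    if kd_molar <= 1e-6:       # about 200 nM to about 1 uM
        return "moderate"
    if kd_molar <= 100e-6:     # greater than 1 uM
        return "low"
    if kd_molar <= 1e-3:       # above about 100 uM
        return "very low"
    return "extremely low"     # above about 1 mM


for kd in (5e-10, 50e-9, 5e-7, 1e-5, 5e-4, 5e-3):
    print(kd, affinity_class(kd))
```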

[0237] Binding compounds can also be characterized by their effect on the activity of the
target molecule. Thus, a "low activity" compound has an inhibitory concentration (IC50)
or excitation concentration (EC50) of greater than 1 µM under standard conditions. By
"very low activity" is meant an IC50 or EC50 of above 100 µM under standard conditions.
By "extremely low activity" is meant an IC50 or EC50 of above 1 mM under standard
conditions. By "moderate activity" is meant an IC50 or EC50 of 200 nM to 1 µM under
standard conditions. By "moderately high activity" is meant an IC50 or EC50 of 1 nM to
200 nM. By "high activity" is meant an IC50 or EC50 of below 1 nM under standard
conditions. The IC50 (or EC50) is defined as the concentration of compound at which 50%
of the activity of the target molecule (e.g., enzyme or other protein) being
measured is lost (or gained) relative to activity when no compound is present. Activity
can be measured using methods known to those of ordinary skill in the art, e.g., by
measuring any detectable product or signal produced by occurrence of an enzymatic
reaction, or other activity by a protein being measured.
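
As an illustration of the IC50 definition, the following Python sketch estimates the concentration at which activity falls to 50% of the no-compound control by interpolating between the two bracketing measurements on a log-concentration scale. This is one simple estimation approach chosen for illustration, not a method prescribed here; curve fitting is a common alternative. The function name and the data are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations, activities):
    """Estimate IC50 from ascending concentrations (mol/L) and activities
    expressed as a fraction of the activity with no compound present."""
    points = list(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        # Find the pair of measurements that brackets 50% activity.
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            frac = (a_lo - 0.5) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None  # 50% activity was not crossed in the tested range


conc = [1e-8, 1e-7, 1e-6, 1e-5]       # made-up dose series
activity = [0.95, 0.75, 0.25, 0.05]   # made-up fractional activities
estimate = ic50_by_interpolation(conc, activity)
print(estimate)
```

For this made-up series the 50% point lies halfway, on the log scale, between 100 nM and 1 µM, so the estimate is roughly 0.3 µM, which falls in the "moderate activity" range defined above.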

[0238] By "background signal" in reference to a binding assay is meant the signal that is 
recorded under standard conditions for the particular assay in the absence of a test 
compound, molecular scaffold, or ligand that binds to the target molecule. Persons of 
ordinary skill in the art will realize that accepted methods exist and are widely available 
for determining background signal. 

[0239] By "standard deviation" is meant the square root of the variance. The variance is
a measure of how spread out a distribution is. It is computed as the average squared
deviation of each number from its mean. For example, for the numbers 1, 2, and 3, the
mean is 2 and the variance is:

$$\sigma^2 = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = 0.667$$

[0240] To design or discover scaffolds that act broadly across protein families, proteins 
of interest can be assayed against a compound collection or set. The assays can preferably 
be enzymatic or binding assays. In some embodiments it may be desirable to enhance the 
solubility of the compounds being screened and then analyze all compounds that show 
activity in the assay, including those that bind with low affinity or produce a signal with 
greater than about three times the standard deviation of the background signal. The assays 
can be any suitable assay such as, for example, binding assays that measure the binding 
affinity between two binding partners. Various types of screening assays that can be 
useful in the practice of the present invention are known in the art, such as those described 
in U.S. Patent Nos. 5,763,198, 5,747,276, 5,877,007, 6,243,980, and 6,294,330,
each of which is hereby incorporated by reference in its entirety, including all charts and 
drawings. 
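
The selection rule based on the standard deviation of the background signal can be sketched in Python as follows. The sketch reads the criterion as "signal exceeds the mean background by more than three standard deviations", which is an interpretive assumption, and it uses the population standard deviation to match the definition of variance given above. The function name and the data are hypothetical.

```python
from statistics import mean, pstdev


def call_hits(signals, background, n_sd=3.0):
    """Return compounds whose signal exceeds background mean + n_sd * SD."""
    bg_mean = mean(background)
    bg_sd = pstdev(background)   # square root of the average squared deviation
    threshold = bg_mean + n_sd * bg_sd
    return {name: s for name, s in signals.items() if s > threshold}


background = [1.0, 2.0, 3.0]          # the worked example: variance 0.667
signals = {"a": 4.0, "b": 5.0}        # made-up assay readings
hits = call_hits(signals, background)
print(hits)
```

With the three background values from the worked example, the standard deviation is about 0.816, so the threshold is about 4.45 and only compound "b" is retained.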

[0241] In various embodiments of the assays at least one compound, at least about 5%,
at least about 10%, at least about 15%, at least about 20%, or at least about 25% of the 
compounds can bind with low affinity. In general, up to about 20% of the compounds can 
show activity in the screening assay and these compounds can then be analyzed directly 
with high-throughput co-crystallography, computational analysis to group the compounds 
into classes with common structural properties (e.g., structural core and/or shape and 
polarity characteristics), and the identification of common chemical structures between 
compounds that show activity. 

[0242] The person of ordinary skill in the art will realize that decisions can be based on 
criteria that are appropriate for the needs of the particular situation, and that the decisions 
can be made by computer software programs. Classes can be created containing almost 
any number of scaffolds, and the criteria selected can be based on increasingly exacting 
criteria until an arbitrary number of scaffolds is arrived at for each class that is deemed to 
be advantageous. 

Surface Plasmon Resonance 

[0243] Binding parameters can be measured using surface plasmon resonance, for 
example, with a BIAcore® chip (Biacore, Japan) coated with immobilized binding 
components. Surface plasmon resonance is used to characterize the microscopic 
association and dissociation constants of reaction between an sFv or other ligand directed 
against target molecules. Such methods are generally described in the following 
references, which are incorporated herein by reference: Vely F. et al., (2000) BIAcore®
analysis to test phosphopeptide-SH2 domain interactions, Methods in Molecular Biology
121:313-21; Liparoto et al., (1999) Biosensor analysis of the interleukin-2 receptor
complex, Journal of Molecular Recognition 12:316-21; Lipschultz et al., (2000)
Experimental design for analysis of complex kinetics using surface plasmon resonance,
Methods 20(3):310-8; Malmqvist, (1999) BIACORE: an affinity biosensor system for
characterization of biomolecular interactions, Biochemical Society Transactions 27:335-40;
Alfthan, (1998) Surface plasmon resonance biosensors as a tool in antibody
engineering, Biosensors & Bioelectronics 13:653-63; Fivash et al., (1998) BIAcore for
macromolecular interaction, Current Opinion in Biotechnology 9:97-101; Price et al.,
(1998) Summary report on the ISOBM TD-4 Workshop: analysis of 56 monoclonal
antibodies against the MUC1 mucin, Tumour Biology 19 Suppl 1:1-20; Malmqvist et al.,
(1997) Biomolecular interaction analysis: affinity biosensor technologies for functional
analysis of proteins, Current Opinion in Chemical Biology 1:378-83; O'Shannessy et al.,
(1996) Interpretation of deviations from pseudo-first-order kinetic behavior in the
characterization of ligand binding by biosensor technology, Analytical Biochemistry
236:275-83; Malmborg et al., (1995) BIAcore as a tool in antibody engineering, Journal of
Immunological Methods 183:7-13; Van Regenmortel, (1994) Use of biosensors to
characterize recombinant proteins, Developments in Biological Standardization 83:143-51;
and O'Shannessy, (1994) Determination of kinetic rate and equilibrium binding
constants for macromolecular interactions: a critique of the surface plasmon resonance
literature, Current Opinion in Biotechnology 5:65-71.

[0244] BIAcore® uses the optical properties of surface plasmon resonance (SPR) to 
detect alterations in protein concentration bound to a dextran matrix lying on the surface 
of a gold/glass sensor chip interface, a dextran biosensor matrix. In brief, proteins are 
covalently bound to the dextran matrix at a known concentration and a ligand for the 
protein is injected through the dextran matrix. Near infrared light, directed onto the
opposite side of the sensor chip surface, is reflected and also induces an evanescent wave
in the gold film, which in turn causes an intensity dip in the reflected light at a particular
angle known as the resonance angle. If the refractive index of the sensor chip surface is
altered (e.g., by ligand binding to the bound protein) a shift occurs in the resonance angle.
This angle shift can be measured and is expressed as resonance units (RUs) such that 1000
RUs is equivalent to a change in surface protein concentration of 1 ng/mm². These
changes are displayed with respect to time along the y-axis of a sensorgram, which depicts 
the association and dissociation of any biological reaction. 

High Throughput Screening (HTS) Assays 

[0245] HTS typically uses automated assays to search through large numbers of 
compounds for a desired activity. Typically HTS assays are used to find new drugs by 
screening for chemicals that act on a particular enzyme or molecule. For example, if a 
chemical inactivates an enzyme it might prove to be effective in preventing a process in a 
cell which causes a disease. High throughput methods enable researchers to assay 
thousands of different chemicals against each target molecule very quickly using robotic 
handling systems and automated analysis of results.

[0246] As used herein, "high throughput screening" or "HTS" refers to the rapid in vitro
screening of large numbers of compounds (libraries); generally tens to hundreds of 
thousands of compounds, using robotic screening assays. Ultra high-throughput Screening 
(uHTS) generally refers to the high-throughput screening accelerated to greater than 
100,000 tests per day. 

[0247] To achieve high-throughput screening, it is advantageous to house samples on a 
multicontainer carrier or platform. A multicontainer carrier facilitates measuring reactions 
of a plurality of candidate compounds simultaneously. Multi-well microplates may be 
used as the carrier. Such multi-well microplates, and methods for their use in numerous 
assays, are both known in the art and commercially available. 

[0248] Screening assays may include controls for purposes of calibration and 
confirmation of proper manipulation of the components of the assay. Blank wells that 
contain all of the reactants but no member of the chemical library are usually included. As 
another example, a known inhibitor (or activator) of an enzyme for which modulators are 
sought, can be incubated with one sample of the assay, and the resulting decrease (or 
increase) in the enzyme activity used as a comparator or control. It will be appreciated 
that modulators can also be combined with the enzyme activators or inhibitors to find 
modulators which inhibit the enzyme activation or repression that is otherwise caused by 
the presence of the known enzyme modulator. Similarly, when ligands to a
sphingolipid target are sought, known ligands of the target can be present in 
control/calibration assay wells. 
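
One common way to use such controls is to normalize each sample well between the blank wells and the known-inhibitor wells. The Python sketch below shows this under the assumption that the blank wells represent uninhibited activity and the known-inhibitor wells represent full inhibition; neither the normalization formula nor the function name comes from this description, and the readings are made up.

```python
from statistics import mean


def percent_inhibition(sample, blank_wells, inhibitor_wells):
    """Scale a sample signal so blanks read 0% and the known inhibitor 100%."""
    uninhibited = mean(blank_wells)          # all reactants, no library member
    fully_inhibited = mean(inhibitor_wells)  # known inhibitor present
    span = uninhibited - fully_inhibited
    if span == 0:
        raise ValueError("controls are indistinguishable")
    return 100.0 * (uninhibited - sample) / span


blanks = [100.0, 102.0, 98.0]    # made-up plate readings
inhibited = [10.0, 12.0, 8.0]    # made-up plate readings
print(percent_inhibition(55.0, blanks, inhibited))
```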

Measuring Enzymatic and Binding Reactions During Screening Assays 

[0249] Techniques for measuring the progression of enzymatic and binding reactions, 
e.g., in multicontainer carriers, are known in the art and include, but are not limited to, the 
following. 

[0250] Spectrophotometric and spectrofluorometric assays are well known in the art. 
Examples of such assays include the use of colorimetric assays for the detection of 
peroxides, as disclosed in Example 1(b) and Gordon, A. J. and Ford, R. A., (1972) The
Chemist's Companion: A Handbook of Practical Data, Techniques, and References, John
Wiley and Sons, N.Y., Page 437.

[0251] Fluorescence spectrometry may be used to monitor the generation of reaction
products. Fluorescence methodology is generally more sensitive than the absorption
methodology. The use of fluorescent probes is well known to those skilled in the art. For
reviews, see Bashford et al., (1987) Spectrophotometry and Spectrofluorometry: A
Practical Approach, pp. 91-114, IRL Press Ltd.; and Bell, (1981) Spectroscopy In
Biochemistry, Vol. I, pp. 155-194, CRC Press.

[0252] In spectrofluorometric methods, enzymes are exposed to substrates that change 
their intrinsic fluorescence when processed by the target enzyme. Typically, the substrate 
is nonfluorescent and is converted to a fluorophore through one or more reactions. As a 
non-limiting example, SMase activity can be detected using the Amplex® Red reagent 
(Molecular Probes, Eugene, OR). In order to measure sphingomyelinase activity using 
Amplex® Red, the following reactions occur. First, SMase hydrolyzes sphingomyelin to 
yield ceramide and phosphorylcholine. Second, alkaline phosphatase hydrolyzes 
phosphorylcholine to yield choline. Third, choline is oxidized by choline oxidase to 
betaine. Finally, H2O2, in the presence of horseradish peroxidase, reacts with Amplex®
Red to produce the fluorescent product, Resorufin, and the signal therefrom is detected 
using spectrofluorometry. 

[0253] Fluorescence polarization (FP) is based on a decrease in the speed of molecular 
rotation of a fluorophore that occurs upon binding to a larger molecule, such as a receptor 
protein, allowing for polarized fluorescent emission by the bound ligand. FP is 
empirically determined by measuring the vertical and horizontal components of 
fluorophore emission following excitation with plane polarized light. Polarized emission 
is increased when the molecular rotation of a fluorophore is reduced. A fluorophore 
produces a larger polarized signal when it is bound to a larger molecule (i.e. a receptor), 
slowing molecular rotation of the fluorophore. The magnitude of the polarized signal 
relates quantitatively to the extent of fluorescent ligand binding. Accordingly, polarization 
of the "bound" signal depends on maintenance of high affinity binding. 
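
The polarization value is conventionally computed from the two emission components as the difference over the sum, often reported in millipolarization (mP) units. The Python sketch below shows that conventional calculation. It assumes vertically polarized excitation, so the vertical emission component is the parallel one, and it includes an optional instrument correction factor (`g_factor`); the function name and intensities are hypothetical.

```python
def polarization_mP(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in mP from parallel and perpendicular
    emission intensities. g_factor corrects for unequal detection of the
    two components and defaults to no correction."""
    corrected_perp = g_factor * i_perpendicular
    total = i_parallel + corrected_perp
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - corrected_perp) / total


free_tracer = polarization_mP(520.0, 480.0)    # made-up: fast rotation
bound_tracer = polarization_mP(600.0, 400.0)   # made-up: slowed rotation
print(free_tracer, bound_tracer)
```

In this made-up example the bound tracer gives the larger value, consistent with slower rotation producing a larger polarized signal.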

[0254] FP is a homogeneous technology and reactions are very rapid, taking seconds to 
minutes to reach equilibrium. The reagents are stable, and large batches may be prepared, 
resulting in high reproducibility. Because of these properties, FP has proven to be highly 
automatable, often performed with a single incubation with a single, premixed,
tracer-receptor reagent. For a review, see Owicki et al., (1997), Application of Fluorescence
Polarization Assays in High-Throughput Screening, Genetic Engineering News, 17:27.

[0255] FP is particularly desirable since its readout is independent of the emission 
intensity (Checovich, W. J., et al., (1995) Nature 375:254-256; Dandliker, W. B., et al.,
(1981) Methods in Enzymology 74:3-28) and is thus insensitive to the presence of colored 
compounds that quench fluorescence emission. FP and FRET (see below) are well-suited 
for identifying compounds that block interactions between sphingolipid receptors and their 
ligands. See, for example, Parker et al., (2000) Development of high throughput screening 
assays using fluorescence polarization: nuclear receptor-ligand-binding and 
kinase/phosphatase assays, J Biomol Screen 5:77-88. 

[0256] Fluorophores derived from sphingolipids that may be used in FP assays are
commercially available. For example, Molecular Probes (Eugene, OR) currently sells
two sphingomyelin fluorophores and one ceramide fluorophore. These are, respectively,
N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-pentanoyl)sphingosyl
phosphocholine (BODIPY® FL C5-sphingomyelin);
N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-dodecanoyl)sphingosyl
phosphocholine (BODIPY® FL C12-sphingomyelin); and
N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-pentanoyl)sphingosine
(BODIPY® FL C5-ceramide). U.S. Patent No. 4,150,949, (Immunoassay for gentamicin),
discloses fluorescein-labelled gentamicins, including fluoresceinthiocarbamyl gentamicin.
Additional fluorophores may be prepared using methods well known to the skilled artisan.

[0257] Exemplary normal-and-polarized fluorescence readers include the POLARION® 
fluorescence polarization system (Tecan AG, Hombrechtikon, Switzerland). General 
multiwell plate readers for other assays are available, such as the VERSAMAX® reader 
and the SPECTRAMAX® multiwell plate spectrophotometer (both from Molecular 
Devices). 

[0258] Fluorescence resonance energy transfer (FRET) is another useful assay for 
detecting interaction and has been described. See, e.g., Heim et al., (1996) Curr. Biol. 
6:178-182; Mitra et al., (1996) Gene 173:13-17; and Selvin et al., (1995) Meth Enzymol 
246:300-345. FRET detects the transfer of energy between two fluorescent substances in 
close proximity, having known excitation and emission wavelengths. As an example, a 
protein can be expressed as a fusion protein with green fluorescent protein (GFP). When
two fluorescent proteins are in proximity, such as when a protein specifically interacts
with a target molecule, the resonance energy can be transferred from one excited molecule
to the other. As a result, the emission spectrum of the sample shifts, which can be
measured by a fluorometer, such as a fMAX multiwell fluorometer (Molecular Devices,
Sunnyvale, Calif.).

[0259] Scintillation proximity assay (SPA) is a particularly useful assay for detecting an
interaction with the target molecule. SPA is widely used in the pharmaceutical industry
and has been described (Hanselman et al., (1997) J. Lipid Res. 38:2365-2373; Kahl et al.,
(1996) Anal. Biochem. 243:282-283; Udenfriend et al., (1987) Anal. Biochem. 161:494-500).
See also U.S. Patent Nos. 4,626,513 and 4,568,649, and European Patent No.
0,154,734. One commercially available system uses FLASHPLATE® scintillant-coated 
plates (NEN Life Science Products, Boston, MA). 

[0260] The target molecule can be bound to the scintillator plates by a variety of well 
known means. Scintillant plates are available that are derivatized to bind to fusion 
proteins such as GST, His6 or Flag fusion proteins. Where the target molecule is a protein 
complex or a multimer, one protein or subunit can be attached to the plate first, then the 
other components of the complex added later under binding conditions, resulting in a 
bound complex. 

[0261] In a typical SPA assay, the gene products in the expression pool will have been 
radiolabeled and added to the wells, and allowed to interact with the solid phase, which is 
the immobilized target molecule and scintillant coating in the wells. The assay can be 
measured immediately or allowed to reach equilibrium. Either way, when a radiolabel 
becomes sufficiently close to the scintillant coating, it produces a signal detectable by a 
device such as a TOPCOUNT NXT® microplate scintillation counter (Packard Bioscience 
Co., Meriden Conn.). If a radiolabeled expression product binds to the target molecule, 
the radiolabel remains in proximity to the scintillant long enough to produce a detectable 
signal. 

[0262] In contrast, the labeled proteins that do not bind to the target molecule, or bind 
only briefly, will not remain near the scintillant long enough to produce a signal above 
background. Any time spent near the scintillant caused by random Brownian motion will 
also not result in a significant amount of signal. Likewise, residual unincorporated
radiolabel used during the expression step may be present, but will not generate significant
signal because it will be in solution rather than interacting with the target molecule. These 
non-binding interactions will therefore cause a certain level of background signal that can 
be mathematically removed. If too many signals are obtained, salt or other modifiers can 
be added directly to the assay plates until the desired specificity is obtained (Nichols et al., 
(1998) Anal. Biochem. 257:112-119). 

Assay Compounds and Molecular Scaffolds 

[0263] Preferred characteristics of a scaffold include being of low molecular weight 
(e.g., less than 350 Da, or from about 100 to about 350 daltons, or from about 150 to about 
300 daltons). Preferably clog P of a scaffold is from -1 to 8, more preferably less than 6, 
5, or 4, most preferably less than 3. In particular embodiments the clogP is in a range -1 
to an upper limit of 2, 3, 4, 5, 6, or 8; or is in a range of 0 to an upper limit of 2, 3, 4, 5, 6,
or 8. Preferably the number of rotatable bonds is less than 5, more preferably less than 4. 
Preferably the number of hydrogen bond donors and acceptors is below 6, more preferably 
below 5. An additional criterion that can be useful is a polar surface area of less than 5. 
Guidance that can be useful in identifying criteria for a particular application can be found 
in Lipinski et al., (1997) Advanced Drug Delivery Reviews 23:3-25, which is hereby
incorporated by reference in its entirety. 

[0264] A scaffold may preferably bind to a given protein binding site in a configuration 
that causes substituent moieties of the scaffold to be situated in pockets of the protein 
binding site. Also, possessing chemically tractable groups that can be chemically 
modified, particularly through synthetic reactions, to easily create a combinatorial library 
can be a preferred characteristic of the scaffold. Also preferred can be having positions on 
the scaffold to which other moieties can be attached, which do not interfere with binding 
of the scaffold to the protein(s) of interest but do cause the scaffold to achieve a desirable 
property, for example, active transport of the scaffold to cells and/or organs, enabling the 
scaffold to be attached to a chromatographic column to facilitate analysis, or another 
desirable property. A molecular scaffold can bind to a target molecule with any affinity, 
such as binding at high affinity, moderate affinity, low affinity, very low affinity, or 
extremely low affinity.

[0265] Thus, the above criteria can be utilized to select many compounds for testing that
have the desired attributes. Many compounds having the criteria described are available in 
the commercial market, and may be selected for assaying depending on the specific needs 
to which the methods are to be applied. 

[0266] A "compound library" or "library" is a collection of different compounds having 
different chemical structures. A compound library is screenable, that is, the compound 
library members therein may be subject to screening assays. In preferred embodiments, 
the library members can have a molecular weight of from about 100 to about 350 daltons, 
or from about 150 to about 350 daltons. Examples of libraries are provided above. 

[0267] Libraries of the present invention can contain at least one compound that binds
to the target molecule at low affinity. Libraries of candidate compounds can be assayed 
by many different assays, such as those described above, e.g., a fluorescence polarization 
assay. Libraries may consist of chemically synthesized peptides, peptidomimetics, or 
arrays of combinatorial chemicals that are large or small, focused or nonfocused. By 
"focused" it is meant that the collection of compounds is prepared using the structure of 
previously characterized compounds and/or pharmacophores. 

[0268] Compound libraries may contain molecules isolated from natural sources, 
artificially synthesized molecules, or molecules synthesized, isolated, or otherwise 
prepared in such a manner so as to have one or more moieties variable, e.g., moieties that 
are independently isolated or randomly synthesized. Types of molecules in compound 
libraries include but are not limited to organic compounds, polypeptides and nucleic acids 
as those terms are used herein, and derivatives, conjugates and mixtures thereof. 

[0269] Compound libraries of the invention may be purchased on the commercial market 
or prepared or obtained by any means including, but not limited to, combinatorial 
chemistry techniques, fermentation methods, plant and cellular extraction procedures and 
the like (see, e.g., Cwirla et al., (1990) Biochemistry, 87, 6378-6382; Houghten et al., 
(1991) Nature, 354, 84-86; Lam et al., (1991) Nature, 354, 82-84; Brenner et al., (1992) 
Proc. Natl. Acad. Sci. USA, 89, 5381-5383; R. A. Houghten, (1993) Trends Genet., 9, 
235-239; E. R. Felder, (1994) Chimia, 48, 512-541; Gallop et al., (1994) J. Med. Chem., 37, 
1233-1251; Gordon et al., (1994) J. Med. Chem., 37, 1385-1401; Carell et al., (1995) 
Chem. Biol., 3, 171-183; Madden et al., Perspectives in Drug Discovery and Design 2, 
269-282; Lebl et al., (1995) Biopolymers, 37, 177-198); small molecules assembled around 
a shared molecular structure; collections of chemicals that have been assembled by various 
commercial and noncommercial groups, natural products; extracts of marine organisms, 
fungi, bacteria, and plants. 

[0270] Preferred libraries can be prepared in a homogeneous reaction mixture, and 
separation of unreacted reagents from members of the library is not required prior to 
screening. Although many combinatorial chemistry approaches are based on solid state 
chemistry, liquid phase combinatorial chemistry is capable of generating libraries (Sun 
CM., (1999) Recent advances in liquid-phase combinatorial chemistry, Combinatorial 
Chemistry & High Throughput Screening. 2:299-318). 

[0271] Libraries of a variety of types of molecules are prepared in order to obtain 
members therefrom having one or more preselected attributes. Such libraries can be 
prepared by a variety of techniques, including but not limited to parallel array synthesis 
(Houghten, (2000) Annu Rev Pharmacol Toxicol 40:273-82, Parallel array and mixture-based 
synthetic combinatorial chemistry); solution-phase combinatorial chemistry (Merritt, 
(1998) Comb Chem High Throughput Screen 1(2):57-72, Solution phase combinatorial 
chemistry; Coe et al., (1998-99) Mol Divers 4(1):31-8, Solution-phase combinatorial 
chemistry; Sun, (1999) Comb Chem High Throughput Screen 2(6):299-318, Recent 
advances in liquid-phase combinatorial chemistry); synthesis on soluble polymer (Gravert 
et al., (1997) Curr Opin Chem Biol 1(1):107-13, Synthesis on soluble polymers: new 
reactions and the construction of small molecules); and the like (see, e.g., Dolle et al., 
(1999) J Comb Chem 1(4):235-82, Comprehensive survey of combinatorial library 
synthesis: 1998; Freidinger RM., (1999) Nonpeptidic ligands for peptide and protein 
receptors, Current Opinion in Chemical Biology; and Kundu et al., Prog Drug Res 
53:89-156, Combinatorial chemistry: polymer supported synthesis of peptide and non-peptide 
libraries). Compounds may be clinically tagged for ease of identification (Chabala, (1995) 
Curr Opin Biotechnol 6(6):633-9, Solid-phase combinatorial chemistry and novel tagging 
methods for identifying leads). 

[0272] The combinatorial synthesis of carbohydrates and libraries containing 
oligosaccharides have been described (Schweizer et al., (1999) Curr Opin Chem Biol 
3(3):291-8, Combinatorial synthesis of carbohydrates). The synthesis of natural-product 
based compound libraries has been described (Wessjohann, (2000) Curr Opin Chem Biol 
4(3):303-9, Synthesis of natural-product based compound libraries). 

[0273] Libraries of nucleic acids are prepared by various techniques, including by way 
of non-limiting example the ones described herein, for the isolation of aptamers. Libraries 
that include oligonucleotides and polyaminooligonucleotides (Markiewicz et al., (2000) 
Synthetic oligonucleotide combinatorial libraries and their applications, Farmaco. 55:174- 
7) displayed on streptavidin magnetic beads are known. Nucleic acid libraries are known 
that can be coupled to parallel sampling and be deconvoluted without complex procedures 
such as automated mass spectrometry (Enjalbal C. Martinez J. Aubagnac JL, (2000) 
Mass spectrometry in combinatorial chemistry, Mass Spectrometry Reviews. 19:139-61) 
and parallel tagging (Perrin DM., Nucleic acids for recognition and catalysis: landmarks, 
limitations, and looking to the future, Combinatorial Chemistry & High Throughput 
Screening 3:243-69). 

[0274] Peptidomimetics are identified using combinatorial chemistry and solid phase 
synthesis (Kim HO. Kahn M., (2000) A merger of rational drug design and combinatorial 
chemistry: development and application of peptide secondary structure mimetics, 
Combinatorial Chemistry & High Throughput Screening 3:167-83; al-Obeidi, (1998) Mol 
Biotechnol 9(3):205-23, Peptide and peptidomimetic libraries. Molecular diversity and 
drug design). The synthesis may be entirely random or based in part on a known 
polypeptide. 

[0275] Polypeptide libraries can be prepared according to various techniques. In brief, 
phage display techniques can be used to produce polypeptide ligands (Gram H., (1999) 
Phage display in proteolysis and signal transduction, Combinatorial Chemistry & High 
Throughput Screening. 2:19-28) that may be used as the basis for synthesis of 
peptidomimetics. Polypeptides, constrained peptides, proteins, protein domains, 
antibodies, single chain antibody fragments, antibody fragments, and antibody combining 
regions are displayed on filamentous phage for selection. 

[0276] Large libraries of individual variants of human single chain Fv antibodies have 
been produced. See, e.g., Siegel RW. Allen B. Pavlik P. Marks JD. Bradbury A., (2000) 
Mass spectral analysis of a protein complex using single-chain antibodies selected on a 
peptide target: applications to functional genomics, Journal of Molecular Biology 
302:285-93; Poul MA. Becerril B. Nielsen UB. Morisson P. Marks JD., (2000) Selection 
of tumor-specific internalizing human antibodies from phage libraries, Journal of 
Molecular Biology. 301:1149-61; Amersdorfer P. Marks JD., (2001) Phage libraries for 
generation of anti-botulinum scFv antibodies, Methods in Molecular Biology. 145:219-40; 
Hughes-Jones NC. Bye JM. Gorick BD. Marks JD. Ouwehand WH., (1999) Synthesis of 
Rh Fv phage-antibodies using VH and VL germline genes, British Journal of 
Haematology. 105:811-6; McCall AM. Amoroso AR. Sautes C. Marks JD. Weiner LM., 
(1998) Characterization of anti-mouse Fc gamma RII single-chain Fv fragments derived 
from human phage display libraries, Immunotechnology. 4:71-87; Sheets MD. 
Amersdorfer P. Finnern R. Sargent P. Lindquist E. Schier R. Hemingsen G. Wong C. 
Gerhart JC. Marks JD. Lindquist E., (1998) Efficient construction of a large nonimmune 
phage antibody library: the production of high-affinity human single-chain antibodies to 
protein antigens (published erratum appears in Proc Natl Acad Sci USA 1999 96:795), 
Proc Natl Acad Sci USA 95:6157-62. 

[0277] Focused or smart chemical and pharmacophore libraries can be designed with the 
help of sophisticated strategies involving computational chemistry (e.g., Kundu B. Khare 
SK. Rastogi SK., (1999) Combinatorial chemistry: polymer supported synthesis of 
peptide and non-peptide libraries, Progress in Drug Research 53:89-156) and the use of 
structure-based ligands using database searching and docking, de novo drug design and 
estimation of ligand binding affinities (Joseph-McCarthy D., (1999) Computational 
approaches to structure-based ligand design, Pharmacology & Therapeutics 84:179-91; 
Kirkpatrick DL. Watson S. Ulhaq S., (1999) Structure-based drug design: combinatorial 
chemistry and molecular modeling, Combinatorial Chemistry & High Throughput 
Screening. 2:211-21; Eliseev AV. Lehn JM., (1999) Dynamic combinatorial chemistry: 
evolutionary formation and screening of molecular libraries, Current Topics in 
Microbiology & Immunology 243:159-72; Bolger et al., (1991) Methods Enz. 203:21-45; 
Martin, (1991) Methods Enz. 203:587-613; Neidle et al., (1991) Methods Enz. 
203:433-458; U.S. Patent 6,178,384). 

X. Crystallography 

[0278] After binding compounds have been determined, the orientation of compound 
bound to target is determined. Preferably this determination involves crystallography on 
co-crystals of molecular scaffold compounds with target. Most protein crystallographic 
platforms can preferably be designed to analyze up to about 500 co-complexes of 
compounds, ligands, or molecular scaffolds bound to protein targets due to the physical 
parameters of the instruments and convenience of operation. If the number of scaffolds 
that have binding activity exceeds a number convenient for the application of 
crystallography methods, the scaffolds can be placed into groups based on having at least 
one common chemical structure or other desirable characteristics, and representative 
compounds can be selected from one or more of the classes. Classes can be made with 
increasingly exacting criteria until a desired number of classes (e.g., 500) is obtained. The 
classes can be based on chemical structure similarities between molecular scaffolds in the 
class, e.g., all possess a pyrrole ring, benzene ring, or other chemical feature. Likewise, 
classes can be based on shape characteristics, e.g., space-filling characteristics. 
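
By way of non-limiting illustration, the grouping step described above can be sketched in Python. The sketch assumes that each scaffold has already been assigned a single class label (for example, a shared ring system); the function name `select_representatives`, the scaffold identifiers, and the labels are hypothetical and are not part of the described method. Representatives are taken one per class in turn until a chosen limit is reached.

```python
from collections import defaultdict
from itertools import zip_longest


def select_representatives(scaffolds, limit):
    """Group scaffolds by class label and pick representatives round-robin.

    scaffolds: iterable of (scaffold_id, class_label) pairs (hypothetical data).
    limit: maximum number of scaffolds to carry into co-crystallography.
    """
    classes = defaultdict(list)
    for scaffold_id, class_label in scaffolds:
        classes[class_label].append(scaffold_id)

    selected = []
    # Take the first member of every class, then the second, and so on,
    # so that every class is represented before any class contributes twice.
    for tier in zip_longest(*classes.values()):
        for scaffold_id in tier:
            if scaffold_id is not None and len(selected) < limit:
                selected.append(scaffold_id)
    return selected


hits = [
    ("s1", "pyrrole"),
    ("s2", "pyrrole"),
    ("s3", "benzene"),
    ("s4", "indole"),
    ("s5", "benzene"),
]
print(select_representatives(hits, 3))
```

With a limit of 3, one scaffold from each of the three hypothetical classes is chosen; raising the limit admits second members of the larger classes.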

[0279] The co-crystallography analysis can be performed by co-complexing each 
scaffold with its target at concentrations of the scaffold that showed activity in the 
screening assay. This co-complexing can be accomplished with the use of low percentage 
organic solvents with the target molecule and then concentrating the target with each of 
the scaffolds. In preferred embodiments these solvents are less than 5% organic solvent 
such as dimethyl sulfoxide (DMSO), ethanol, methanol, or ethylene glycol in water or 
another aqueous solvent. Each scaffold complexed to the target molecule can then be 
screened with a suitable number of crystallization screening conditions at both 4 °C and 
20 °C. In preferred embodiments, about 96 crystallization screening conditions can be 
performed in order to obtain sufficient information about the co-complexation and 
crystallization conditions, and the orientation of the scaffold at the binding site of the 
target molecule. Crystal structures can then be analyzed to determine how the bound 
scaffold is oriented physically within the binding site or within one or more binding 
pockets of the molecular family member. 
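
As a rough illustration of the scale of the screening described above, the following Python sketch enumerates one trial per scaffold, crystallization condition, and temperature. The function name `screening_matrix` and the scaffold identifiers are hypothetical, and reading the two temperatures as degrees Celsius is an assumption of this sketch.

```python
from itertools import product


def screening_matrix(scaffold_ids, n_conditions=96, temperatures_c=(4, 20)):
    """Enumerate every (scaffold, condition index, temperature) trial.

    Temperatures are assumed to be in degrees Celsius.
    """
    conditions = range(1, n_conditions + 1)
    return list(product(scaffold_ids, conditions, temperatures_c))


trials = screening_matrix(["s1", "s2"])
print(len(trials))  # 2 scaffolds x 96 conditions x 2 temperatures
```

Under these assumptions each scaffold contributes 192 trials, so 500 scaffolds would correspond to 96,000 individual crystallization trials.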

[0280] It is desirable to determine the atomic coordinates of the compounds bound to the 
target proteins in order to determine which is a most suitable scaffold for the protein 
family. X-ray crystallographic analysis is therefore most preferable for determining the 
atomic coordinates. Those compounds selected can be further tested with the application 
of medicinal chemistry. Compounds can be selected for medicinal chemistry testing based 
on their binding position in the target molecule. For example, when the compound binds 
at a binding site, the compound's binding position in the binding site of the target 
molecule can be considered with respect to the chemistry that can be performed on 
chemically tractable structures or sub-structures of the compound, and how such 
modifications on the compound might interact with structures or sub-structures on the 
binding site of the target. Thus, one can explore the binding site of the target and the 
chemistry of the scaffold in order to make decisions on how to modify the scaffold to 
arrive at a ligand with higher potency and/or selectivity. This process allows for more 
direct design of ligands, by utilizing structural and chemical information obtained directly 
from the co-complex, thereby enabling one to more efficiently and quickly design lead 
compounds that are likely to lead to beneficial drug products. In various embodiments it 
may be desirable to perform co-crystallography on all scaffolds that bind, or only those 
that bind with a particular affinity, for example, only those that bind with high affinity, 
moderate affinity, low affinity, very low affinity, or extremely low affinity. It may also be 
advantageous to perform co-crystallography on a selection of scaffolds that bind with any 
combination of affinities. 

[0281] Standard X-ray protein diffraction studies such as by using a Rigaku RU-200® 
(Rigaku, Tokyo, Japan) with an X-ray imaging plate detector or a synchrotron beam-line 
can be performed on co-crystals and the diffraction data measured on a standard X-ray 
detector, such as a CCD detector or an X-ray imaging plate detector. 

[0282] Performing X-ray crystallography on about 200 co-crystals should generally lead 
to about 50 co-crystal structures, which should provide about 10 scaffolds for validation 
in chemistry, which should finally result in about 5 selective leads for target molecules. 
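
The approximate attrition figures given above can be expressed as per-stage yields. The following Python sketch is illustrative only: the stage counts are the approximate values stated in the text, and the helper names `stage_yields` and `cocrystals_needed` are hypothetical. Integer arithmetic is used for the scale-up so that rounding does not depend on floating-point error.

```python
# Approximate stage counts taken from the text above.
STAGES = [
    ("co-crystals analyzed", 200),
    ("co-crystal structures", 50),
    ("scaffolds for validation in chemistry", 10),
    ("selective leads", 5),
]


def stage_yields(stages=STAGES):
    """Return the fraction carried forward at each transition."""
    return [
        (later_name, later_count / earlier_count)
        for (_, earlier_count), (later_name, later_count) in zip(stages, stages[1:])
    ]


def cocrystals_needed(target_leads, stages=STAGES):
    """Scale the funnel linearly: co-crystals needed for a target lead count."""
    first, last = stages[0][1], stages[-1][1]
    return -(-target_leads * first // last)  # ceiling division on integers


for name, fraction in stage_yields():
    print(f"{name}: {fraction:.0%}")
print(cocrystals_needed(8))
```

The linear scale-up assumes that the same approximate yields hold at every stage, which the text presents only as a general expectation.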

Virtual Assays 

[0283] Commercially available software that generates three-dimensional graphical 
representations of the complexed target and compound from a set of coordinates provided 
can be used to illustrate and study how a compound is oriented when bound to a target 
(e.g., QUANTA®, Accelrys, San Diego, CA). Thus, the existence of binding pockets at 
the binding site of the targets can be particularly useful in the present invention. These 
binding pockets are revealed by the crystallographic structure determination and show the 
precise chemical interactions involved in binding the compound to the binding site of the 
target. The person of ordinary skill will realize that the illustrations can also be used to 
decide where chemical groups might be added, substituted, modified, or deleted from the 
scaffold to enhance binding or another desirable effect, by considering where unoccupied 
space is located in the complex and which chemical substructures might have suitable size 
and/or charge characteristics to fill it. The person of ordinary skill will also realize that 
regions within the binding site can be flexible and their properties can change as a result of 
scaffold binding, and that chemical groups can be specifically targeted to those regions to 
achieve a desired effect. Specific locations on the molecular scaffold can be considered 
with reference to where a suitable chemical substructure can be attached and in which 
conformation, and which site has the most advantageous chemistry available. 

[0284] An understanding of the forces that bind the compounds to the target proteins 
reveals which compounds can most advantageously be used as scaffolds, and which 
properties can most effectively be manipulated in the design of ligands. The person of 
ordinary skill will realize that steric, ionic, hydrogen bond, and other forces can be 
considered for their contribution to the maintenance or enhancement of the target- 
compound complex. Additional data can be obtained with automated computational 
methods, such as docking and/or Free Energy Perturbations (FEP), to account for other 
energetic effects such as desolvation penalties. The compounds selected can be used to 
generate information about the chemical interactions with the target or for elucidating 
chemical modifications that can enhance selectivity of binding of the compound. 

[0285] Computer models, such as homology models (i.e., based on a known, 
experimentally derived structure) can be constructed using data from the co-crystal 
structures. When the target molecule is a protein or enzyme, preferred co-crystal 
structures for making homology models contain high sequence identity in the binding site 
of the protein sequence being modeled, and the proteins will preferentially also be within 
the same class and/or fold family. Knowledge of conserved residues in active sites of a 
protein class can be used to select homology models that accurately represent the binding 
site. Homology models can also be used to map structural information from a surrogate 
protein where an apo or co-crystal structure exists to the target protein. 

[0286] Virtual screening methods, such as docking, can also be used to predict the 
binding configuration and affinity of scaffolds, compounds, and/or combinatorial library 
members to homology models. Using these data and carrying out "virtual experiments" 
with computer software can save substantial resources and allow the person of ordinary 
skill to make decisions about which compounds can be suitable scaffolds or ligands, 
without having to actually synthesize the ligand and perform co-crystallization. Decisions 
thus can be made about which compounds merit actual synthesis and co-crystallization. 
An understanding of such chemical interactions aids in the discovery and design of drugs 
that interact more advantageously with target proteins and/or are more selective for one 
protein family member over others. Thus, applying these principles, compounds with 
superior properties can be discovered. 

[0287] Additives that promote co-crystallization can of course be included in the target 
molecule formulation in order to enhance the formation of co-crystals. In the case of 
proteins or enzymes, the scaffold to be tested can be added to the protein formulation, 
which is preferably present at a concentration of approximately 1 mg/ml. The formulation 
can also contain between 0%-10% (v/v) organic solvent, e.g. DMSO, methanol, ethanol, 
propane diol, or 1,3 dimethyl propane diol (MPD) or some combination of those organic 
solvents. Compounds are preferably solubilized in the organic solvent at a concentration 
of about 10 mM and added to the protein sample at a concentration of about 100 μM. The 
protein-compound complex is then concentrated to a final concentration of protein of from 
about 5 to about 20 mg/ml. The complexation and concentration steps can conveniently 
be performed using a 96-well formatted concentration apparatus (e.g., Amicon Inc., 
Piscataway, NJ). Buffers and other reagents present in the formulation being crystallized 
can contain other components that promote crystallization or are compatible with 
crystallization conditions, such as DTT, propane diol, or glycerol. 
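
The concentration step described above implies a simple material budget, sketched below in Python. The starting concentration (about 1 mg/ml) and final range (about 5 to about 20 mg/ml) come from the text; the drop volume and number of crystallization conditions are inputs, and the values used in the example (1 μl drops, 96 conditions) are illustrative. The function name `concentration_plan` is hypothetical, and handling losses are ignored.

```python
def concentration_plan(start_mg_per_ml, final_mg_per_ml, drop_volume_ul, n_conditions):
    """Estimate fold concentration, starting volume, and protein mass.

    Returns (fold, starting_volume_ul, protein_ug). Losses during
    concentration are ignored in this sketch.
    """
    fold = final_mg_per_ml / start_mg_per_ml
    concentrated_volume_ul = drop_volume_ul * n_conditions
    starting_volume_ul = concentrated_volume_ul * fold
    # 1 mg/ml is the same as 1 microgram per microliter.
    protein_ug = concentrated_volume_ul * final_mg_per_ml
    return fold, starting_volume_ul, protein_ug


low = concentration_plan(1.0, 5.0, 1.0, 96)
high = concentration_plan(1.0, 20.0, 1.0, 96)
print(low)
print(high)
```

Under these assumptions, one scaffold screened in 96 drops of 1 μl would consume roughly 0.5 mg to 2 mg of protein, depending on the final concentration chosen.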

[0288] The crystallization experiment can be set up by placing small aliquots of the 
concentrated protein-compound complex (1 μl) in a 96-well format and sampling under 96 
crystallization conditions. (Other screening formats can also be used, e.g., plates with 
greater than 96 wells.) Crystals can typically be obtained using standard crystallization 
protocols that can involve the 96 well crystallization plate being placed at different 
temperatures. Co-crystallization varying factors other than temperature can also be 
considered for each protein-compound complex if desirable. For example, atmospheric 
pressure, the presence or absence of light or oxygen, a change in gravity, and many other 
variables can all be tested. The person of ordinary skill in the art will realize other 
variables that can advantageously be varied and considered. 

Ligand Design and Preparation 

[0289] The design and preparation of ligands can be performed with or without 
structural and/or co-crystallization data by considering the chemical structures in common 
between the active scaffolds of a set. In this process structure-activity hypotheses can be 
formed and those chemical structures found to be present in a substantial number of the 
scaffolds, including those that bind with low affinity, can be presumed to have some effect 
on the binding of the scaffold. This binding can be presumed to induce a desired 
biochemical effect when it occurs in a biological system (e.g., a treated mammal). New or 
modified scaffolds or combinatorial libraries derived from scaffolds can be tested to 
disprove the maximum number of binding and/or structure-activity hypotheses. The 
remaining hypotheses can then be used to design ligands that achieve a desired binding 
and biochemical effect. 
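
The first step of the process described above, finding chemical structures shared by a substantial number of active scaffolds, can be sketched as a simple frequency count. In the following Python sketch each active scaffold is represented by a set of substructure labels; the labels, the scaffold identifiers, the function name `common_substructures`, and the 50% cutoff are all hypothetical choices made for illustration.

```python
from collections import Counter


def common_substructures(active_scaffolds, min_fraction=0.5):
    """Return substructures present in at least min_fraction of the actives.

    active_scaffolds: dict mapping scaffold_id -> set of substructure labels.
    """
    counts = Counter()
    for features in active_scaffolds.values():
        counts.update(set(features))
    cutoff = min_fraction * len(active_scaffolds)
    return sorted(name for name, n in counts.items() if n >= cutoff)


actives = {
    "s1": {"pyrrole", "amide"},
    "s2": {"pyrrole", "nitro"},
    "s3": {"pyrrole", "amide"},
    "s4": {"benzene"},
}
print(common_substructures(actives))
```

Each substructure returned corresponds to one candidate structure-activity hypothesis, which can then be tested, and possibly disproved, with new or modified scaffolds.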

[0290] But in many cases it will be preferred to have co-crystallography data for 
consideration of how to modify the scaffold to achieve the desired binding effect (e.g., 
binding at higher affinity or with higher selectivity). Using the case of proteins and 
enzymes, co-crystallography data shows the binding pocket of the protein with the 
molecular scaffold bound to the binding site, and it will be apparent that a modification 
can be made to a chemically tractable group on the scaffold. For example, a small volume 
of space at a protein binding site or pocket might be filled by modifying the scaffold to 
include a small chemical group that fills the volume. Filling the void volume can be 
expected to result in a greater binding affinity, or the loss of undesirable binding to 
another member of the protein family. Similarly, the co-crystallography data may show 
that deletion of a chemical group on the scaffold may decrease a hindrance to binding and 
result in greater binding affinity or specificity. 

[0291] It can be desirable to take advantage of the presence of a charged chemical group 
located at the binding site or pocket of the protein. For example, a positively charged 
group can be complemented with a negatively charged group introduced on the molecular 
scaffold. This can be expected to increase binding affinity or binding specificity, thereby 
resulting in a more desirable ligand. In many cases, regions of protein binding sites or 
pockets are known to vary from one family member to another based on the amino acid 
differences in those regions. Chemical additions in such regions can result in the creation 
or elimination of certain interactions (e.g., hydrophobic, electrostatic, or entropic) that 
allow a compound to be more specific for one protein target over another or to bind with 
greater affinity, thereby enabling one to synthesize a compound with greater selectivity or 
affinity for a particular family member. Additionally, certain regions can contain amino 
acids that are known to be more flexible than others. This often occurs in amino acids 
contained in loops connecting elements of the secondary structure of the protein, such as 
alpha helices or beta strands. Additions of chemical moieties can also be directed to these 
flexible regions in order to increase the likelihood of a specific interaction occurring 
between the protein target of interest and the compound. Virtual screening methods can 
also be conducted in silico to assess the effect of chemical additions, subtractions, 
modifications, and/or substitutions on compounds with respect to members of a protein 
family or class. 

[0292] The addition, subtraction, or modification of a chemical structure or sub-structure 
to a scaffold can be performed with any suitable chemical moiety. For example, the 
following moieties, which are provided by way of example and are not intended to be 
limiting, can be utilized: hydrogen, alkyl, alkoxy, phenoxy, alkenyl, alkynyl, phenylalkyl, 
hydroxyalkyl, haloalkyl, aryl, arylalkyl, alkyloxy, alkylthio, alkenylthio, phenyl, 
phenylalkylthio, hydroxyalkylthio, alkylthiocarbamylthio, cyclohexyl, 
pyridyl, piperidinyl, alkylamino, amino, nitro, mercapto, cyano, hydroxyl, a halogen atom, 
halomethyl, an oxygen atom (e.g., forming a ketone or N-oxide), or a sulphur atom (e.g., 
forming a thiol, thione, di-alkylsulfoxide or sulfone). 

[0293] Additional examples of structures or sub-structures that may be utilized are an 
aryl optionally substituted with one, two, or three substituents independently selected from 
the group consisting of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, carboxamide, 
nitro, and ester moieties; an amine of formula -NX2X3, where X2 and X3 are independently 
selected from the group consisting of hydrogen, saturated or unsaturated alkyl, and 
homocyclic or heterocyclic ring moieties; halogen or trihalomethyl; a ketone of formula 
-COX4, where X4 is selected from the group consisting of alkyl and homocyclic or 
heterocyclic ring moieties; a carboxylic acid of formula -(X5)nCOOH or ester of formula 
-(X6)nCOOX7, where X5, X6, and X7 are independently selected from the group 
consisting of alkyl and homocyclic or heterocyclic ring moieties and where n is 0 or 1; an 
alcohol of formula -(X8)nOH or an alkoxy moiety of formula -(X8)nOX9, where X8 and X9 
are independently selected from the group consisting of saturated or unsaturated alkyl and 
homocyclic or heterocyclic ring moieties, wherein said ring is optionally substituted with 
one or more substituents independently selected from the group consisting of alkyl, 
alkoxy, halogen, trihalomethyl, carboxylate, nitro, and ester and where n is 0 or 1; an 
amide of formula -NHCOX10, where X10 is selected from the group consisting of alkyl, 
hydroxyl, and homocyclic or heterocyclic ring moieties, wherein said ring is optionally 
substituted with one or more substituents independently selected from the group consisting 
of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, nitro, and ester; -SO2NX11X12, 
where X11 and X12 are selected from the group consisting of hydrogen, alkyl, and 
homocyclic or heterocyclic ring moieties; a homocyclic or heterocyclic ring moiety 
optionally substituted with one, two, or three substituents independently selected from the 
group consisting of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, carboxamide, nitro, 
and ester moieties; an aldehyde of formula -CHO; a sulfone of formula -SO2X13, where 
X13 is selected from the group consisting of saturated or unsaturated alkyl and homocyclic 
or heterocyclic ring moieties; and a nitro of formula -NO2. 

Identification of Attachment Sites on Molecular Scaffolds and Ligands 

[0294] In addition to the identification and development of ligands for kinases and other 
enzymes, determination of the orientation of a molecular scaffold or other binding 
compound in a binding site allows identification of energetically allowed sites for 
attachment of the binding molecule to another component. For such sites, any free energy 
change associated with the presence of the attached component should not destabilize the 
binding of the compound to the kinase to an extent that will disrupt the binding. 
Preferably, the binding energy with the attachment should be at least 4 kcal/mol, more 
preferably at least 6, 8, 10, 12, 15, or 20 kcal/mol. Preferably, the presence of the 
attachment at the particular site reduces binding energy by no more than 3, 4, 5, 8, 10, 12, 
or 15 kcal/mol. 
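
The two preferences stated above can be combined into a single check, sketched here in Python. Binding energies are treated as positive magnitudes in kcal/mol (larger meaning tighter binding), which is an assumption of this sketch; the function name `attachment_is_tolerated` and the example values are hypothetical. The defaults use the least stringent thresholds from the text (at least 4 kcal/mol remaining, no more than 3 kcal/mol lost).

```python
def attachment_is_tolerated(energy_without, energy_with,
                            min_remaining=4.0, max_loss=3.0):
    """Check an attachment site against the two stated preferences.

    energy_without: binding energy magnitude of the free compound (kcal/mol).
    energy_with: binding energy magnitude with the attachment (kcal/mol).
    """
    loss = energy_without - energy_with
    return energy_with >= min_remaining and loss <= max_loss


print(attachment_is_tolerated(10.0, 8.0))  # loses 2, keeps 8
print(attachment_is_tolerated(10.0, 5.0))  # loses 5, more than allowed
print(attachment_is_tolerated(5.0, 3.0))   # keeps only 3, below the floor
```

Other threshold pairs from the ranges recited above can be supplied through `min_remaining` and `max_loss`.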

[0295] In many cases, suitable attachment sites will be those that are exposed to solvent 
when the binding compound is bound in the binding site. In some cases, attachment sites 
can be used that will result in small displacements of a portion of the enzyme without an 
excessive energetic cost. Exposed sites can be identified in various ways. For example, 
exposed sites can be identified using a graphic display or 3-dimensional model. In a 
graphic display, such as a computer display, an image of a compound bound in a binding 
site can be visually inspected to reveal atoms or groups on the compound that are exposed 
to solvent and oriented such that attachment at such atom or group would not preclude 
binding of the enzyme and binding compound. Energetic costs of attachment can be 
calculated based on changes or distortions that would be caused by the attachment as well 
as entropic changes. 
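
Visual inspection can be supplemented by a crude numerical screen. The Python sketch below is not the method described above; it substitutes a simple neighbor count as a stand-in for solvent exposure, flagging ligand atoms that have few protein atoms within a distance cutoff. The coordinates, atom names, cutoff, neighbor limit, and the function name `exposed_atoms` are all hypothetical.

```python
import math


def exposed_atoms(ligand_atoms, protein_coords, cutoff=4.5, max_neighbors=2):
    """Return names of ligand atoms with few nearby protein atoms.

    ligand_atoms: list of (atom_name, (x, y, z)) in angstroms.
    protein_coords: list of (x, y, z) in angstroms.
    A low neighbor count is used here as a rough proxy for solvent exposure.
    """
    exposed = []
    for name, xyz in ligand_atoms:
        neighbors = sum(1 for p in protein_coords if math.dist(xyz, p) <= cutoff)
        if neighbors <= max_neighbors:
            exposed.append(name)
    return exposed


protein = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
ligand = [("C1", (0.5, 0.5, 0.5)), ("N7", (10.0, 10.0, 10.0))]
print(exposed_atoms(ligand, protein))
```

A proper solvent-accessible surface calculation would be more reliable; the neighbor count is only a quick first pass for shortlisting candidate attachment atoms.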

[0296] Many different types of components can be attached. Persons with skill are 
familiar with the chemistries used for various attachments. Examples of components that 
can be attached include, without limitation: solid phase components such as beads, plates, 
chips, and wells; a direct or indirect label; a linker, which may be a traceless linker; among 
others. Such linkers can themselves be attached to other components, e.g., to solid phase 
media, labels, and/or binding moieties. 

[0297] The binding energy of a compound and the effects on binding energy for 
attaching the molecule to another component can be calculated approximately using any of 
a variety of available software or by manual calculation. An example is the following: 

[0298] Calculations were performed to estimate binding energies of different organic 
molecules to two kinases: PIM-1 and CDK2. The organic molecules considered included 
Staurosporine, identified compounds that bind to PIM-1, and several linkers. 

[0299] Calculated binding energies between protein-ligand complexes were obtained 
using the FlexX score (an implementation of the Bohm scoring function) within the Tripos 
software suite. The form for that equation is shown in Eqn. 1 below: 

$$\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{tr}} + \Delta G_{\mathrm{hb}} + \Delta G_{\mathrm{ion}} + \Delta G_{\mathrm{lipo}} + \Delta G_{\mathrm{arom}} + \Delta G_{\mathrm{rot}} \qquad (1)$$

[0300] where: $\Delta G_{\mathrm{tr}}$ is a constant term that accounts for the overall loss of rotational and 
translational entropy of the ligand, $\Delta G_{\mathrm{hb}}$ accounts for hydrogen bonds formed between 
the ligand and protein, $\Delta G_{\mathrm{ion}}$ accounts for the ionic interactions between the ligand and 
protein, $\Delta G_{\mathrm{lipo}}$ accounts for the lipophilic interaction that corresponds to the 
protein-ligand contact surface, $\Delta G_{\mathrm{arom}}$ accounts for interactions between aromatic rings in the 
protein and ligand, and $\Delta G_{\mathrm{rot}}$ accounts for the entropic penalty of restricting rotatable 
bonds in the ligand upon binding. 

[0301] This method estimates the free energy that a lead compound should have to a 
target protein for which there is a crystal structure, and it accounts for the entropic penalty 
of flexible linkers. It can therefore be used to estimate the free energy penalty incurred by 
attaching linkers to molecules being screened and the binding energy that a lead 
compound should have in order to overcome the free energy penalty of the linker. The 
method does not account for solvation and the entropic penalty is likely overestimated for 
cases where the linker is bound to a solid phase through another binding complex, such as 
a biotin:streptavidin complex. 
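
The additive form of the scoring equation, and its use for estimating a linker penalty, can be illustrated with a short Python sketch. This is not the FlexX implementation and does not call any Tripos software; it only sums six term values supplied by the user. The class name `ScoreTerms` and all numerical values are hypothetical, chosen so that the only difference between the two examples is a larger rotatable-bond penalty for the compound carrying a flexible linker.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ScoreTerms:
    """Six additive contributions to the binding energy, in kcal/mol."""
    tr: float    # constant rotational/translational entropy term
    hb: float    # hydrogen bonds
    ion: float   # ionic interactions
    lipo: float  # lipophilic contact surface
    arom: float  # aromatic ring interactions
    rot: float   # penalty for restricting rotatable bonds

    def total(self) -> float:
        """Sum of all six terms."""
        return sum(astuple(self))


# Hypothetical term values for a compound without and with a flexible linker.
free = ScoreTerms(tr=5.0, hb=-9.5, ion=-4.0, lipo=-8.5, arom=-2.0, rot=3.5)
linked = ScoreTerms(tr=5.0, hb=-9.5, ion=-4.0, lipo=-8.5, arom=-2.0, rot=11.0)

linker_penalty = linked.total() - free.total()
print(free.total(), linked.total(), linker_penalty)
```

In this illustration the linker changes only the rotatable-bond term, so the penalty is simply the difference between the two totals.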

[0302] Co-crystals were aligned by superimposing residues of PIM-1 with 
corresponding residues in CDK2. The PIM-1 structure used for these calculations was a 
co-crystal of PIM-1 with a binding compound. The CDK2:Staurosporine co-crystal used 
was from the Brookhaven database file 1aq1. Hydrogen atoms were added to the proteins 
and atomic charges were assigned using the AMBER95 parameters within Sybyl. 
Modifications to the compounds described were made within the Sybyl modeling suite 
from Tripos. 

[0303] These calculations indicate that the calculated binding energy for compounds that
bind strongly to a given target (such as Staurosporine:CDK2) can be lower than -25
kcal/mol, while the calculated binding energy for a good scaffold or an unoptimized
binding compound can be in the range of -15 to -20 kcal/mol. The free energy penalty for
attachment to a linker such as ethylene glycol or hexatriene is estimated as typically
being in the range of +5 to +15 kcal/mol.
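The arithmetic implied by these ranges can be sketched as follows. The sketch assumes the linker penalty simply adds to the calculated binding energy of the unlinked compound; the function names are hypothetical, and the +5 to +15 kcal/mol window is the estimate quoted in the text.

```python
from typing import Tuple

# Estimated linker attachment penalty from the text, in kcal/mol.
LINKER_PENALTY_KCAL: Tuple[float, float] = (5.0, 15.0)


def energy_after_linker(unlinked: float,
                        penalty: Tuple[float, float] = LINKER_PENALTY_KCAL
                        ) -> Tuple[float, float]:
    """Range of calculated binding energy once a linker is attached
    (best case first, worst case second)."""
    low, high = penalty
    return (unlinked + low, unlinked + high)


def unlinked_energy_needed(target_linked: float,
                           penalty: Tuple[float, float] = LINKER_PENALTY_KCAL
                           ) -> Tuple[float, float]:
    """Unlinked binding energy needed to reach target_linked after linker
    attachment (worst-case penalty first, best-case penalty second)."""
    low, high = penalty
    return (target_linked - high, target_linked - low)


# A scaffold scoring -18 kcal/mol before linker attachment:
print(energy_after_linker(-18.0))      # (-13.0, -3.0)
# To retain -15 kcal/mol with the linker attached:
print(unlinked_energy_needed(-15.0))   # (-30.0, -20.0)
```

Under these assumptions, a compound in the -15 to -20 kcal/mol scaffold range may lose most of its calculated binding energy on linker attachment, which is why the text asks what energy a lead compound needs to overcome the penalty.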

Linkers 

[0304] Linkers suitable for use in the invention can be of many different types. Linkers 
can be selected for particular applications based on factors such as linker chemistry 
compatible for attachment to a binding compound and to another component utilized in the 
particular application. Additional factors can include, without limitation, linker length, 
linker stability, and ability to remove the linker at an appropriate time. Exemplary linkers 
include, but are not limited to, hexyl, hexatrienyl, ethylene glycol, and peptide linkers. 
Traceless linkers can also be used, e.g., as described in Plunkett, M. J., and Ellman, J. A., 
(1995), J. Org. Chem., 60:6006.

[0305] Typical functional groups that are utilized to link binding compound(s) include,
but are not limited to, carboxylic acid, amine, hydroxyl, and thiol. (Examples can be found in
Solid-supported combinatorial and parallel synthesis of small molecular weight compound
libraries; (1998) Tetrahedron organic chemistry series Vol. 17; Pergamon; p. 85).

Labels 

[0306] As indicated above, labels can also be attached to a binding compound or to a 
linker attached to a binding compound. Such attachment may be direct (attached directly 
to the binding compound) or indirect (attached to a component that is directly or indirectly 
attached to the binding compound). Such labels allow detection of the compound either 
directly or indirectly. Attachment of labels can be performed using conventional
chemistries. Labels can include, for example, fluorescent labels, radiolabels, light 
scattering particles, light absorbent particles, magnetic particles, enzymes, and specific 
binding agents (e.g., biotin or an antibody target moiety). 

Solid Phase Media 

[0307] Additional examples of components that can be attached directly or indirectly to 
a binding compound include various solid phase media. Similar to attachment of linkers 
and labels, attachment to solid phase media can be performed using conventional 
chemistries. Such solid phase media can include, for example, small components such as 
beads, nanoparticles, and fibers (e.g., in suspension or in a gel or chromatographic matrix). 
Likewise, solid phase media can include larger objects such as plates, chips, slides, and 
tubes. In many cases, the binding compound will be attached to only a portion of such an
object, e.g., in a spot or other local element on a generally flat surface or in a well or
portion of a well.

Identification of Biological Agents 

[0308] The possession of structural information about a protein also provides for the
identification of useful biological agents, such as epitopes for development of antibodies,
identification of mutation sites expected to affect activity, and identification of attachment
sites allowing attachment of the protein to materials such as labels, linkers, peptides, and
solid phase media.

[0309] Antibodies (Abs) find multiple applications in a variety of areas including
biotechnology, medicine and diagnosis, and indeed they are one of the most powerful tools
for life science research. Abs directed against protein antigens can recognize either linear
or native three-dimensional (3D) epitopes. Obtaining Abs that recognize 3D
epitopes requires the use of whole native protein (or of a portion that assumes a native
conformation) as immunogen. Unfortunately, this is not always an option for various
technical reasons: for example, the native protein is not available, the protein is toxic,
or it is desirable to utilize a high density antigen presentation. In such cases,
immunization with peptides is the alternative. Of course, Abs generated in this manner
will recognize linear epitopes, and they might or might not recognize the source native
protein, but they will still be useful for standard laboratory applications such as western
blots. The selection of peptides to use as immunogens can be accomplished by following
particular selection rules and/or use of epitope prediction software.

[0310] Though methods to predict antigenic peptides are not infallible, there are several 
rules that can be followed to determine what peptide fragments from a protein are likely to 
be antigenic. These rules are also intended to increase the likelihood that an Ab to a
particular peptide will recognize the native protein. 

• 1. Antigenic peptides should be located in solvent accessible regions and contain
both hydrophobic and hydrophilic residues.

o For proteins of known 3D structure, solvent accessibility can be determined
using a variety of programs such as DSSP, NACCESS, or WHATIF, among
others.

o If the 3D structure is not known, use any of the following web servers to
predict accessibilities: PHD, JPRED, PredAcc, ACCpro.

• 2. Preferably select peptides lying in long loops connecting Secondary Structure
(SS) motifs, avoiding peptides located in helical regions. This will increase the
odds that the Ab recognizes the native protein. Such peptides can, for example, be
identified from a crystal structure or crystal structure-based homology model.

o For proteins with known 3D coordinates, SS can be obtained from the
sequence link of the relevant entry at the Brookhaven data bank. The
PDBsum server also offers SS analysis of pdb records.

o When no structure is available, secondary structure predictions can be
obtained from any of the following servers: PHD, JPRED, PSI-PRED,
NNSP, etc.

• 3. When possible, choose peptides that are in the N- and C-terminal regions of the
protein. Because the N- and C-terminal regions of proteins are usually solvent
accessible and unstructured, Abs against those regions are also likely to recognize
the native protein.

• 4. For cell surface glycoproteins, eliminate from initial peptides those containing
consensus sites for N-glycosylation.

o N-glycosylation sites can be detected using ScanProsite or NetNGlyc.

[0311] In addition, several methods based on various physico-chemical properties of
experimentally determined epitopes (flexibility, hydrophilicity, accessibility) have been
published for the prediction of antigenic determinants and can be used. The antigenic
index and Preditop are examples.

[0312] Perhaps the simplest method for the prediction of antigenic determinants is that 
of Kolaskar and Tongaonkar, which is based on the occurrence of amino acid residues in 
experimentally determined epitopes. (Kolaskar and Tongaonkar (1990) A semi-empirical 
method for prediction of antigenic determinants on protein antigens. FEBS Lett.
276(1-2):172-174.) The prediction algorithm works as follows:

• 1. Calculate the average propensity for each overlapping 7-mer and assign the
result to the central residue (i+3) of the 7-mer.

• 2. Calculate the average for the whole protein.

• 3. (a) If the average for the whole protein is above 1.0, then all residues having
average propensity above 1.0 are potentially antigenic.

• 3. (b) If the average for the whole protein is below 1.0, then all residues having
average propensity above the average for the whole protein are potentially antigenic.

• 4. Find 8-mers where all residues are selected by step 3 above (6-mers in the
original paper).
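The four steps listed above can be sketched in Python as shown below. This is an illustrative reading of the steps, not the GCG implementation. It makes two assumptions: the "average for the whole protein" is taken as the mean of the windowed averages, and step 4 reports maximal runs of at least eight selected residues. The per-residue propensity table is supplied by the caller. The two-letter table in the usage lines is a toy example and is not the published Kolaskar and Tongaonkar scale.

```python
from typing import Dict, List, Optional, Tuple


def window_averages(seq: str, propensity: Dict[str, float],
                    window: int = 7) -> List[Optional[float]]:
    """Step 1: average propensity of each overlapping window, stored at the
    central residue (index i+3 for a 7-mer starting at index i)."""
    half = window // 2
    averages: List[Optional[float]] = [None] * len(seq)
    for start in range(len(seq) - window + 1):
        values = [propensity[aa] for aa in seq[start:start + window]]
        averages[start + half] = sum(values) / window
    return averages


def antigenic_segments(seq: str, propensity: Dict[str, float],
                       window: int = 7, min_len: int = 8
                       ) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) runs of selected residues."""
    averages = window_averages(seq, propensity, window)
    scored = [a for a in averages if a is not None]
    if not scored:
        return []
    protein_average = sum(scored) / len(scored)   # step 2 (assumed definition)
    threshold = min(1.0, protein_average)         # step 3 (a) and (b)
    selected = [a is not None and a > threshold for a in averages]

    segments: List[Tuple[int, int]] = []
    run_start: Optional[int] = None
    for i, flag in enumerate(selected + [False]):  # sentinel closes a final run
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= min_len:           # step 4
                segments.append((run_start, i))
            run_start = None
    return segments


# Toy propensities (hypothetical): 'A' favorable, 'G' unfavorable.
toy_scale = {"A": 1.2, "G": 0.8}
toy_sequence = "G" * 10 + "A" * 12 + "G" * 10
print(antigenic_segments(toy_sequence, toy_scale))
```

Residues within three positions of either terminus never sit at the center of a full 7-mer, so this sketch leaves them unscored and unselected.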

[0313] The Kolaskar and Tongaonkar method is also available from the GCG package, 
and it runs using the command egcg. 

[0314] Crystal structures also allow identification of residues at which mutation is likely 
to alter the activity of the protein. Such residues include, for example, residues that 
interact with substrate, conserved active site residues, and residues that are in a region of
ordered secondary structure or involved in tertiary interactions. The mutations that are
likely to affect activity will vary for different molecular contexts. Mutations in an active 
site that will affect activity are typically substitutions or deletions that eliminate a charge- 
charge or hydrogen bonding interaction, or introduce a steric interference. Mutations in 
secondary structure regions or molecular interaction regions that are likely to affect 
activity include, for example, substitutions that alter the hydrophobicity/hydrophilicity of a 
region, or that introduce a sufficient strain in a region near or including the active site so 
that critical residue(s) in the active site are displaced. Such substitutions and/or deletions 
and/or insertions are recognized, and the predicted structural and/or energetic effects of 
mutations can be calculated using conventional software. 

IX. Kinase Activity Assays 

[0315] A number of different assays for kinase activity can be utilized for assaying for 
active modulators and/or determining specificity of a modulator for a particular kinase or 
group of kinases. In addition to the assays mentioned below, one of ordinary skill in the art
will know of other assays that can be utilized and can modify an assay for a particular 
application. 

[0316] An exemplary assay for kinase activity that can be used for PYK2 can be
performed according to the following procedure using purified kinase with myelin basic
protein (MBP) as substrate. An exemplary assay can use the following materials: MBP
(M-1891, Sigma); kinase buffer (KB = HEPES 50 mM, pH 7.2, MgCl2:MnCl2 (200
µM:200 µM)); ATP (γ-33P): NEG602H (10 mCi/mL) (Perkin-Elmer); ATP as 100 mM stock
in kinase buffer; EDTA as 100 mM stock solution.

[0317] Coat a scintillation plate suitable for radioactivity counting (e.g., FlashPlate from
Perkin-Elmer, such as the SMP200 (basic)) with kinase+MBP mix (final 100 ng+300
ng/well) at 90 µL/well in kinase buffer. Add compounds at 1 µL/well from 10 mM stock
in DMSO. Positive control wells receive 1 µL of DMSO. Negative control wells
receive 2 µL of EDTA stock solution. ATP solution (10 µL) is added to each well
to provide a final concentration of cold ATP of 2 µM, and 50 nCi ATPγ[33P]. The plate is
shaken briefly, and a count is taken to provide the initial count (IC) using an apparatus
adapted for counting with the plate selected, e.g., Perkin-Elmer Trilux. Store the plate at
37°C for 4 hrs, then count again to provide the final count (FC).
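The volumes in this procedure imply final well concentrations that can be checked with a short Python sketch. It assumes the volumes are additive, so a test well holds 90 + 1 + 10 = 101 µL; negative control wells, which receive 2 µL of EDTA, differ slightly. The function name is hypothetical.

```python
def final_concentration(stock_conc: float, added_volume: float,
                        total_volume: float) -> float:
    """Concentration after diluting added_volume of stock into total_volume.
    The result has the units of stock_conc; both volumes share one unit."""
    if total_volume <= 0 or added_volume < 0 or added_volume > total_volume:
        raise ValueError("volumes must satisfy 0 <= added <= total, total > 0")
    return stock_conc * added_volume / total_volume


# Test well: 90 uL kinase+MBP mix, 1 uL compound, 10 uL ATP solution.
WELL_VOLUME_UL = 90.0 + 1.0 + 10.0

# 10 mM (10,000 uM) compound stock, 1 uL per well.
compound_um = final_concentration(10_000.0, 1.0, WELL_VOLUME_UL)

# DMSO carried in with the compound, as volume percent.
dmso_percent = final_concentration(100.0, 1.0, WELL_VOLUME_UL)

# Cold ATP concentration needed in the 10 uL addition to reach 2 uM final.
atp_addition_um = 2.0 * WELL_VOLUME_UL / 10.0

print(f"compound: {compound_um:.1f} uM, DMSO: {dmso_percent:.2f} %, "
      f"ATP addition: {atp_addition_um:.1f} uM")
```

Under these assumptions the test compound is screened at roughly 99 µM in about 1% DMSO.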

[0318] Net 33P incorporation (NI) is calculated as: NI = FC - IC. 

[0319] The effect of the presence of a test compound can then be calculated as the percent
of the positive control as: %PC = [(NI - NC) / (PC - NC)] x 100, where NC is the net 
incorporation for the negative control, and PC is the net incorporation for the positive 
control. 
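The two formulas, NI = FC - IC and %PC = [(NI - NC) / (PC - NC)] x 100, can be written as a short Python sketch. The function names and the plate counts in the usage lines are hypothetical.

```python
def net_incorporation(final_count: float, initial_count: float) -> float:
    """NI = FC - IC for a single well."""
    return final_count - initial_count


def percent_of_positive_control(ni: float, nc: float, pc: float) -> float:
    """%PC = (NI - NC) / (PC - NC) x 100, where NC and PC are the net
    incorporations of the negative and positive control wells."""
    if pc == nc:
        raise ValueError("positive and negative controls give the same signal")
    return (ni - nc) / (pc - nc) * 100.0


# Hypothetical counts, each given as (FC, IC).
pc = net_incorporation(5200.0, 200.0)  # DMSO-only positive control
nc = net_incorporation(400.0, 200.0)   # EDTA negative control
ni = net_incorporation(2800.0, 200.0)  # well containing a test compound

print(percent_of_positive_control(ni, nc, pc))
```

With these numbers the test well retains half of the control activity. A %PC near 100 indicates no inhibition, and a value near 0 indicates inhibition down to the EDTA background.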

[0320] As indicated above, other assays can also be readily used. For example, kinase 
activity can be measured on standard polystyrene plates, using biotinylated MBP and 
ATPγ[33P] and with Streptavidin-coated SPA (scintillation proximity) beads providing the
signal. 

[0321] Additional alternative assays can employ phospho-specific antibodies as 
detection reagents with biotinylated peptides as substrates for the kinase. This sort of 
assay can be formatted either in a fluorescence resonance energy transfer (FRET) format, 
or using an AlphaScreen (amplified luminescent proximity homogeneous assay) format by
varying the donor and acceptor reagents that are attached to streptavidin or the
phospho-specific antibody.

X. Organic Synthetic Techniques 

[0322] The versatility of computer-based modulator design and identification lies in the
diversity of structures screened by the computer programs. The computer programs can
search databases that contain very large numbers of molecules and can modify modulators
already complexed with the enzyme with a wide variety of chemical functional groups. A
consequence of this chemical diversity is that a potential modulator of kinase function may
take a chemical form that is not predictable. A wide array of organic synthetic techniques
exists in the art to meet the challenge of constructing these potential modulators. Many of
these organic synthetic methods are described in detail in standard reference sources
utilized by those skilled in the art. One example of such a reference is March, 1994,
Advanced Organic Chemistry: Reactions, Mechanisms and Structure, New York, McGraw
Hill. Thus, the techniques useful to synthesize a potential modulator of kinase function
identified by computer-based methods are readily available to those skilled in the art of
organic chemical synthesis.

XI. Administration 

[0323] The methods and compounds will typically be used in therapy for human 
patients. However, they may also be used to treat similar or identical diseases in other 
vertebrates such as other primates, sports animals, and pets such as horses, dogs and cats. 

[0324] Suitable dosage forms, in part, depend upon the use or the route of 
administration, for example, oral, transdermal, transmucosal, or by injection (parenteral). 
Such dosage forms should allow the compound to reach target cells. Other factors are 
well known in the art, and include considerations such as toxicity and dosage forms that 
retard the compound or composition from exerting its effects. Techniques and 
formulations generally may be found in Remington's Pharmaceutical Sciences, 18th ed.,
Mack Publishing Co., Easton, PA, 1990 (hereby incorporated by reference herein). 

[0325] Compounds can be formulated as pharmaceutically acceptable salts. 
Pharmaceutically acceptable salts are non-toxic salts in the amounts and concentrations at 
which they are administered. The preparation of such salts can facilitate the 
pharmacological use by altering the physical characteristics of a compound without 
preventing it from exerting its physiological effect. Useful alterations in physical 
properties include lowering the melting point to facilitate transmucosal administration and 
increasing the solubility to facilitate administering higher concentrations of the drug.

[0326] Pharmaceutically acceptable salts include acid addition salts such as those
containing sulfate, chloride, hydrochloride, fumarate, maleate, phosphate, sulfamate,
acetate, citrate, lactate, tartrate, methanesulfonate, ethanesulfonate, benzenesulfonate,
p-toluenesulfonate, cyclohexylsulfamate and quinate. Pharmaceutically acceptable salts can
be obtained from acids such as hydrochloric acid, maleic acid, sulfuric acid, phosphoric
acid, sulfamic acid, acetic acid, citric acid, lactic acid, tartaric acid, malonic acid,
methanesulfonic acid, ethanesulfonic acid, benzenesulfonic acid, p-toluenesulfonic acid,
cyclohexylsulfamic acid, fumaric acid, and quinic acid.

[0327] Pharmaceutically acceptable salts also include basic addition salts such as those 
containing benzathine, chloroprocaine, choline, diethanolamine, ethylenediamine, 
meglumine, procaine, aluminum, calcium, lithium, magnesium, potassium, sodium, 
ammonium, alkylamine, and zinc, when acidic functional groups, such as carboxylic acid 
or phenol are present. For example, see Remington's Pharmaceutical Sciences, 19th ed.,
Mack Publishing Co., Easton, PA, Vol. 2, p. 1457, 1995. Such salts can be prepared using 
the appropriate corresponding bases. 

[0328] Pharmaceutically acceptable salts can be prepared by standard techniques. For 
example, the free-base form of a compound is dissolved in a suitable solvent, such as an
aqueous or aqueous-alcohol solution containing the appropriate acid, and then isolated
by evaporating the solution. In another example, a salt is prepared by reacting the free 
base and acid in an organic solvent. 

[0329] The pharmaceutically acceptable salt of the different compounds may be present 
as a complex. Examples of complexes include 8-chlorotheophylline complex (analogous 
to, e.g., dimenhydrinate: diphenhydramine 8-chlorotheophylline (1:1) complex; 
Dramamine) and various cyclodextrin inclusion complexes. 

[0330] Carriers or excipients can be used to produce pharmaceutical compositions. The 
carriers or excipients can be chosen to facilitate administration of the compound. 
Examples of carriers include calcium carbonate, calcium phosphate, various sugars such as 
lactose, glucose, or sucrose, or types of starch, cellulose derivatives, gelatin, vegetable 
oils, polyethylene glycols and physiologically compatible solvents. Examples of 
physiologically compatible solvents include sterile solutions of water for injection (WFI), 
saline solution, and dextrose.

[0331] The compounds can be administered by different routes including intravenous,
intraperitoneal, subcutaneous, intramuscular, oral, transmucosal, rectal, or transdermal. 
Oral administration is preferred. For oral administration, for example, the compounds can 
be formulated into conventional oral dosage forms such as capsules, tablets, and liquid 
preparations such as syrups, elixirs, and concentrated drops. 

[0332] Pharmaceutical preparations for oral use can be obtained, for example, by 
combining the active compounds with solid excipients, optionally grinding a resulting 
mixture, and processing the mixture of granules, after adding suitable auxiliaries, if 
desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such 
as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations, for 
example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, 
methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose (CMC), 
and/or polyvinylpyrrolidone (PVP: povidone). If desired, disintegrating agents may be 
added, such as cross-linked polyvinylpyrrolidone, agar, or alginic acid, or a salt
thereof such as sodium alginate. 

[0333] Dragee cores are provided with suitable coatings. For this purpose, concentrated 
sugar solutions may be used, which may optionally contain, for example, gum arabic, talc, 
polyvinylpyrrolidone, carbopol gel, polyethylene glycol (PEG), and/or titanium dioxide, 
lacquer solutions, and suitable organic solvents or solvent mixtures. Dye-stuffs or 
pigments may be added to the tablets or dragee coatings for identification or to 
characterize different combinations of active compound doses. 

[0334] Pharmaceutical preparations that can be used orally include push-fit capsules
made of gelatin ("gelcaps"), as well as soft, sealed capsules made of gelatin, and a 
plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active 
ingredients in admixture with filler such as lactose, binders such as starches, and/or 
lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, 
the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, 
liquid paraffin, or liquid polyethylene glycols (PEGs). In addition, stabilizers may be 
added. 

[0335] Alternatively, injection (parenteral administration) may be used, e.g., 
intramuscular, intravenous, intraperitoneal, and/or subcutaneous. For injection, the
compounds of the invention are formulated in sterile liquid solutions, preferably in
physiologically compatible buffers or solutions, such as saline solution, Hank's solution, or 
Ringer's solution. In addition, the compounds may be formulated in solid form and 
redissolved or suspended immediately prior to use. Lyophilized forms can also be 
produced. 

[0336] Administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art, 
and include, for example, for transmucosal administration, bile salts and fusidic acid 
derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal 
administration, for example, may be through nasal sprays or suppositories (rectal or 
vaginal). 

[0337] The amounts of the various compounds to be administered can be determined by
standard procedures taking into account factors such as the compound IC50, the biological
half-life of the compound, the age, size, and weight of the patient, and the disorder
associated with the patient. The importance of these and other factors is well known to
those of ordinary skill in the art. Generally, a dose will be between about 0.01 and 50
mg/kg, preferably 0.1 and 20 mg/kg of the patient being treated. Multiple doses may be
used. 
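The per-kilogram ranges above scale linearly with body mass, as the following Python sketch shows. It is an arithmetic illustration only. The function name and the 70 kg example mass are hypothetical, and the other factors named in the text (IC50, half-life, age, disorder) are not modeled.

```python
from typing import Tuple

GENERAL_RANGE_MG_PER_KG: Tuple[float, float] = (0.01, 50.0)
PREFERRED_RANGE_MG_PER_KG: Tuple[float, float] = (0.1, 20.0)


def dose_range_mg(body_mass_kg: float,
                  range_mg_per_kg: Tuple[float, float]) -> Tuple[float, float]:
    """Convert a (low, high) mg/kg range into a total (low, high) mass in mg."""
    low, high = range_mg_per_kg
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    if low > high:
        raise ValueError("range must be given as (low, high)")
    return (low * body_mass_kg, high * body_mass_kg)


# Hypothetical 70 kg subject.
print(dose_range_mg(70.0, GENERAL_RANGE_MG_PER_KG))
print(dose_range_mg(70.0, PREFERRED_RANGE_MG_PER_KG))
```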

Manipulation of PYK2 

[0338] As the full-length coding sequence and amino acid sequence of PYK2 are known,
cloning, construction of recombinant PYK2, production and purification of recombinant
protein, introduction of PYK2 into other organisms, and other molecular biological
manipulations of PYK2 are readily performed.

[0339] Techniques for the manipulation of nucleic acids, such as, e.g., subcloning, 
labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, 
amplification), sequencing, hybridization and the like are well disclosed in the scientific 
and patent literature, see, e.g., Sambrook, ed., Molecular Cloning: a Laboratory Manual 
(2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989); Current Protocols in 
Molecular Biology, Ausubel, ed. John Wiley & Sons, Inc., New York (1997); Laboratory
Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid
Probes, Part I. Theory and Nucleic Acid Preparation, Tijssen, ed. Elsevier, N.Y. (1993).

[0100] Nucleic acid sequences can be amplified as necessary for further use using
amplification methods, such as PCR, isothermal methods, rolling circle methods, etc., that
are well known to the skilled artisan. See, e.g., Saiki, "Amplification of Genomic DNA" in
PCR Protocols, Innis et al., Eds., Academic Press, San Diego, CA 1990, pp 13-20;
Wharam et al., Nucleic Acids Res. 2001 Jun 1;29(11):E54-E54; Hafner et al.,
Biotechniques 2001 Apr;30(4):852-6, 858, 860 passim; Zhong et al., Biotechniques 2001
Apr;30(4):852-6, 858, 860 passim.

[0340] Nucleic acids, vectors, capsids, polypeptides, and the like can be analyzed and 
quantified by any of a number of general means well known to those of skill in the art. 
These include, e.g., analytical biochemical methods such as NMR, spectrophotometry, 
radiography, electrophoresis, capillary electrophoresis, high performance liquid 
chromatography (HPLC), thin layer chromatography (TLC), and hyperdiffusion 
chromatography, various immunological methods, e.g. fluid or gel precipitin reactions, 
immunodiffusion, immuno-electrophoresis, radioimmunoassays (RIAs), enzyme-linked 
immunosorbent assays (ELISAs), immuno-fluorescent assays, Southern analysis, Northern 
analysis, dot-blot analysis, gel electrophoresis (e.g., SDS-PAGE), nucleic acid or target or 
signal amplification methods, radiolabeling, scintillation counting, and affinity 
chromatography. 

[0341] Obtaining and manipulating nucleic acids used to practice the methods of the 
invention can be performed by cloning from genomic samples, and, if desired, screening 
and re-cloning inserts isolated or amplified from, e.g., genomic clones or cDNA clones. 
Sources of nucleic acid used in the methods of the invention include genomic or cDNA 
libraries contained in, e.g., mammalian artificial chromosomes (MACs), see, e.g., U.S. 
Patent Nos. 5,721,118; 6,025,155; human artificial chromosomes, see, e.g., Rosenfeld
(1997) Nat. Genet. 15:333-335; yeast artificial chromosomes (YAC); bacterial artificial
chromosomes (BAC); P1 artificial chromosomes, see, e.g., Woon (1998) Genomics
50:306-316; P1-derived vectors (PACs), see, e.g., Kern (1997) Biotechniques 23:120-124;
cosmids, recombinant viruses, phages or plasmids. Typically, nucleic acid molecules
having a sequence of interest are available from commercial sources and/or from sequence
repositories, or can be obtained using PCR from a suitable cDNA or genomic library, e.g.,
a library from an appropriate tissue. A number of different such libraries are 
commercially or publicly available. 

[0342] The nucleic acids can be operatively linked to a promoter. A promoter can be 
one motif or an array of nucleic acid control sequences which direct transcription of a 
nucleic acid. A promoter can include necessary nucleic acid sequences near the start site 
of transcription, such as, in the case of a polymerase II type promoter, a TATA element. 
A promoter also optionally includes distal enhancer or repressor elements which can be 
located as much as several thousand base pairs from the start site of transcription. A 
"constitutive" promoter is a promoter which is active under most environmental and 
developmental conditions. An "inducible" promoter is a promoter which is under 
environmental or developmental regulation. A "tissue specific" promoter is active in 
certain tissue types of an organism, but not in other tissue types from the same organism. 
The term "operably linked" refers to a functional linkage between a nucleic acid 
expression control sequence (such as a promoter, or array of transcription factor binding 
sites) and a second nucleic acid sequence, wherein the expression control sequence directs 
transcription of the nucleic acid corresponding to the second sequence. 

[0343] The nucleic acids of the invention can also be provided in expression vectors and 
cloning vehicles, e.g., sequences encoding the polypeptides of the invention. Expression 
vectors and cloning vehicles of the invention can comprise viral particles, baculovirus, 
phage, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral 
DNA (e.g., vaccinia, adenovirus, fowl pox virus, pseudorabies and derivatives of SV40),
P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any
other vectors specific for specific hosts of interest (such as bacillus, Aspergillus and yeast). 
Vectors of the invention can include chromosomal, non-chromosomal and synthetic DNA 
sequences. Large numbers of suitable vectors are known to those of skill in the art, and 
are commercially available. 

[0344] The nucleic acids of the invention can be cloned, if desired, into any of a variety 
of vectors using routine molecular biological methods; methods for cloning in vitro 
amplified nucleic acids are disclosed, e.g., U.S. Pat. No. 5,426,039. To facilitate cloning 
of amplified sequences, restriction enzyme sites can be "built into" a PCR primer pair.
Vectors may be introduced into a genome or into the cytoplasm or a nucleus of a cell and 
expressed by a variety of conventional techniques, well described in the scientific and 
patent literature. See, e.g., Roberts (1987) Nature 328:731; Schneider (1995) Protein 
Expr. Purif 6435:10; Sambrook, Tijssen or Ausubel. The vectors can be isolated from 
natural sources, obtained from such sources as ATCC or GenBank libraries, or prepared 
by synthetic or recombinant methods. For example, the nucleic acids of the invention can 
be expressed in expression cassettes, vectors or viruses which are stably or transiently 
expressed in cells (e.g., episomal expression systems). Selection markers can be 
incorporated into expression cassettes and vectors to confer a selectable phenotype on 
transformed cells and sequences. For example, selection markers can code for episomal 
maintenance and replication such that integration into the host genome is not required. 
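The practice noted in this paragraph of building restriction enzyme sites into a PCR primer pair can be sketched in Python. The sketch prepends a recognition sequence, plus a short 5' clamp, to the annealing portion of each primer. The EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are standard. The function names, the two-base "GC" clamp, the template sequence, and the six-base annealing length are hypothetical and chosen for brevity, not as primer design guidance.

```python
from typing import Tuple

# Standard recognition sequences; both are palindromic.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tailed_primer_pair(template: str, anneal_len: int,
                       fwd_enzyme: str, rev_enzyme: str,
                       clamp: str = "GC") -> Tuple[str, str]:
    """Build forward and reverse primers (5'->3') that each carry a
    restriction site 5' of the template-annealing portion."""
    template = template.upper()
    if not 0 < anneal_len <= len(template):
        raise ValueError("anneal_len must be between 1 and len(template)")
    for enzyme in (fwd_enzyme, rev_enzyme):
        # The sites are palindromic, so checking one strand covers both.
        if RECOGNITION_SITES[enzyme] in template:
            raise ValueError(f"{enzyme} site already present in the template")
    forward = clamp + RECOGNITION_SITES[fwd_enzyme] + template[:anneal_len]
    reverse = (clamp + RECOGNITION_SITES[rev_enzyme]
               + reverse_complement(template[-anneal_len:]))
    return forward, reverse


# Hypothetical 21-nt template.
template = "ATGACCGGTAAACTGCAGTAA"
fwd, rev = tailed_primer_pair(template, 6, "EcoRI", "BamHI")
print(fwd)  # GCGAATTCATGACC
print(rev)  # GCGGATCCTTACTG
```

The check for an internal site matters because a recognition sequence inside the amplified region would be cut along with the primer-borne sites during cloning.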

[0345] The nucleic acids can be administered in vivo for in situ expression of the
peptides or polypeptides of the invention. The nucleic acids can be administered as
"naked DNA" (see, e.g., U.S. Patent No. 5,580,859) or in the form of an expression vector,
e.g., a recombinant virus. The nucleic acids can be administered by any route, including
peri- or intra-tumorally, as described below. Vectors administered in vivo can be derived
from viral genomes, including recombinantly modified enveloped or non-enveloped DNA
and RNA viruses, preferably selected from Baculoviridae, Parvoviridae, Picornaviridae,
Herpesviridae, Poxviridae, or Adenoviridae. Chimeric vectors may also
be employed which exploit advantageous merits of each of the parent vector properties
(see, e.g., Feng (1997) Nature Biotechnology 15:866-870). Such viral genomes may be
modified by recombinant DNA techniques to include the nucleic acids of the invention,
and may be further engineered to be replication deficient, conditionally replicating or
replication competent. In alternative aspects, vectors are derived from adenoviral (e.g.,
replication incompetent vectors derived from the human adenovirus genome, see, e.g.,
U.S. Patent Nos. 6,096,718; 6,110,458; 6,113,913; 5,631,236), adeno-associated viral and
retroviral genomes. Retroviral vectors can include those based upon murine leukemia
virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus
(SIV), human immunodeficiency virus (HIV), and combinations thereof; see, e.g., U.S.
Patent Nos. 6,117,681; 6,107,478; 5,658,775; 5,449,614; Buchscher (1992) J. Virol.
66:2731-2739; Johann (1992) J. Virol. 66:1635-1640. Adeno-associated virus (AAV)-based
vectors can be used to transduce cells with target nucleic acids, e.g., in the in vitro
production of nucleic acids and peptides, and in in vivo and ex vivo gene therapy
procedures; see, e.g., U.S. Patent Nos. 6,110,456; 5,474,935; Okada (1996) Gene Ther.
3:957-964.

[0346] The present invention also relates to fusion proteins, and nucleic acids encoding 
them. A polypeptide of the invention can be fused to a heterologous peptide or 
polypeptide, such as N-terminal identification peptides which impart desired 
characteristics, such as increased stability or simplified purification. Peptides and 
polypeptides of the invention can also be synthesized and expressed as fusion proteins 
with one or more additional domains linked thereto for, e.g., producing a more 
immunogenic peptide, to more readily isolate a recombinantly synthesized peptide, to 
identify and isolate antibodies and antibody-expressing B cells, and the like. Detection 
and purification facilitating domains include, e.g., metal chelating peptides such as 
polyhistidine tracts and histidine-tryptophan modules that allow purification on 
immobilized metals, protein A domains that allow purification on immobilized 
immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification 
system (Immunex Corp, Seattle WA). A cleavable linker sequence such
as Factor Xa or enterokinase (Invitrogen, San Diego CA) can be included between a
purification domain and the motif-comprising peptide or polypeptide to facilitate
purification. For example, an
expression vector can include an epitope-encoding nucleic acid sequence linked to six 
histidine residues followed by a thioredoxin and an enterokinase cleavage site (see e.g., 
Williams (1995) Biochemistry 34:1787-1797; Dobeli (1998) Protein Expr. Purif 12:404- 
414). The histidine residues facilitate detection and purification while the enterokinase 
cleavage site provides a means for purifying the epitope from the remainder of the fusion 
protein. In one aspect, a nucleic acid encoding a polypeptide of the invention is assembled 
in appropriate phase with a leader sequence capable of directing secretion of the translated 
polypeptide or fragment thereof. Technology pertaining to vectors encoding fusion 
proteins and the application of fusion proteins is well described in the scientific and patent 
literature, see e.g., Kroll (1993) DNA Cell Biol. 12:441-53. 

[0347] The nucleic acids and polypeptides of the invention can be bound to a solid 
support, e.g., for use in screening and diagnostic methods. Solid supports can include, 
e.g., membranes (e.g., nitrocellulose or nylon), a microtiter dish (e.g., PVC, 
polypropylene, or polystyrene), a test tube (glass or plastic), a dip stick (e.g., glass, PVC, 
polypropylene, polystyrene, latex and the like), a microfuge tube, or a glass, silica, plastic, 
metallic or polymer bead or other substrate such as paper. One solid support uses a metal 
(e.g., cobalt or nickel)-comprising column which binds with specificity to a histidine tag 
engineered onto a peptide. 

[0348] Adhesion of molecules to a solid support can be direct (i.e., the molecule contacts 
the solid support) or indirect (a "linker" is bound to the support and the molecule of 
interest binds to this linker). Molecules can be immobilized either covalently (e.g., 
utilizing single reactive thiol groups of cysteine residues; see, e.g., Colliuod (1993) 
Bioconjugate Chem. 4:528-536) or non-covalently but specifically, e.g., via immobilized 
antibodies (see, e.g., Schuhmann (1991) Adv. Mater. 3:388-391; Lu (1995) Anal. Chem. 
67:83-87); the biotin/streptavidin system (see, e.g., Iwane (1997) Biophys. Biochem. Res. 
Comm. 230:76-80); metal-chelating Langmuir-Blodgett films (see, e.g., Ng (1995) 
Langmuir 11:4048-55); or metal-chelating self-assembled monolayers (see, e.g., Sigal (1996) 
Anal. Chem. 68:490-497) for binding of polyhistidine fusions. 

[0349] Indirect binding can be achieved using a variety of linkers which are 
commercially available. The reactive ends can be any of a variety of functionalities 
including, but not limited to: amino reacting ends such as N-hydroxysuccinimide (NHS) 
active esters, imidoesters, aldehydes, epoxides, sulfonyl halides, isocyanate, 
isothiocyanate, and nitroaryl halides; and thiol reacting ends such as pyridyl disulfides, 
maleimides, thiophthalimides, and active halogens. The heterobifunctional crosslinking 
reagents have two different reactive ends, e.g., an amino-reactive end and a thiol-reactive 
end, while homobifunctional reagents have two similar reactive ends, e.g., 
bismaleimidohexane (BMH) which permits the cross-linking of sulfhydryl-containing 
compounds. The spacer can be of varying length and be aliphatic or aromatic. Examples 
of commercially available homobifunctional cross-linking reagents include, but are not 
limited to, the imidoesters such as dimethyl adipimidate dihydrochloride (DMA); 
dimethyl pimelimidate dihydrochloride (DMP); and dimethyl suberimidate 
dihydrochloride (DMS). Heterobifunctional reagents include commercially available 
active halogen-NHS active esters coupling agents such as N-succinimidyl bromoacetate 
and N-succinimidyl (4-iodoacetyl)aminobenzoate (SIAB) and the sulfosuccinimidyl 
derivatives such as sulfosuccinimidyl(4-iodoacetyl)aminobenzoate (sulfo-SIAB) (Pierce). 
Another group of coupling agents is the heterobifunctional and thiol cleavable agents 
such as N-succinimidyl 3-(2-pyridyldithio)propionate (SPDP) (Pierce Chemicals, 
Rockford, IL). 

[0350] Antibodies can also be used for binding polypeptides and peptides of the 
invention to a solid support. This can be done directly by binding peptide-specific 
antibodies to the column or it can be done by creating fusion protein chimeras comprising 
motif-containing peptides linked to, e.g., a known epitope (e.g., a tag such as FLAG or myc) or 
an appropriate immunoglobulin constant domain sequence (an "immunoadhesin," see, 
e.g., Capon (1989) Nature 337:525-531). 

[0351] Nucleic acids or polypeptides of the invention can be immobilized to or applied 
to an array. Arrays can be used to screen for or monitor libraries of compositions (e.g., 
small molecules, antibodies, nucleic acids, etc.) for their ability to bind to or modulate the 
activity of a nucleic acid or a polypeptide of the invention. For example, in one aspect of 
the invention, a monitored parameter is transcript expression of a gene comprising a 
nucleic acid of the invention. One or more, or all, of the transcripts of a cell can be measured 
by hybridization of a sample comprising transcripts of the cell, or nucleic acids 
representative of or complementary to transcripts of a cell, to 
immobilized nucleic acids on an array, or "biochip." By using an "array" of nucleic acids 
on a microchip, some or all of the transcripts of a cell can be simultaneously quantified. 
Alternatively, arrays comprising genomic nucleic acid can also be used to determine the 
genotype of a newly engineered strain made by the methods of the invention. "Polypeptide 
arrays" can also be used to simultaneously quantify a plurality of proteins. 

[0352] The terms "array" or "microarray" or "biochip" or "chip" as used herein refer to a 
plurality of target elements, each target element comprising a defined amount of one or 
more polypeptides (including antibodies) or nucleic acids immobilized onto a defined area 
of a substrate surface. In practicing the methods of the invention, any known array and/or 
method of making and using arrays can be incorporated in whole or in part, or variations 
thereof, as disclosed, for example, in U.S. Patent Nos. 6,277,628; 6,277,489; 6,261,776; 
6,258,606; 6,054,270; 6,048,695; 6,045,996; 6,022,963; 6,013,440; 5,965,452; 5,959,098; 
5,856,174; 5,830,645; 5,770,456; 5,632,957; 5,556,752; 5,143,854; 5,807,522; 5,800,992; 
5,744,305; 5,700,637; 5,556,752; 5,434,049; see also, e.g., WO 99/51773; WO 99/09217; 
WO 97/46313; WO 96/17958; see also, e.g., Johnston (1998) Curr. Biol. 8:R171-R174; 
Schummer (1997) Biotechniques 23:1087-1092; Kern (1997) Biotechniques 23:120-124; 
Solinas-Toldo (1997) Genes, Chromosomes & Cancer 20:399-407; Bowtell (1999) Nature 
Genetics Supp. 21:25-32. See also published U.S. patent applications Nos. 20010018642; 
20010019827; 20010016322; 20010014449; 20010014448; 20010012537; 20010008765. 

Host Cells and Transformed Cells 

[0353] The invention also provides a transformed cell comprising a nucleic acid 
sequence of the invention, e.g., a sequence encoding a polypeptide of the invention, or a 
vector of the invention. The host cell may be any of the host cells familiar to those skilled 
in the art, including prokaryotic cells and eukaryotic cells, such as bacterial cells, fungal cells, 
yeast cells, mammalian cells, insect cells, or plant cells. Exemplary bacterial cells include 
E. coli, Streptomyces, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus. Exemplary insect 
cells include Drosophila S2 and Spodoptera Sf9. Exemplary animal cells include CHO, 
COS or Bowes melanoma or any mouse or human cell line. The selection of an 
appropriate host is within the abilities of those skilled in the art. 

[0354] Vectors may be introduced into the host cells using any of a variety of 
techniques, including transformation, transfection, transduction, viral infection, gene guns, 
or Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, 
DEAE-Dextran mediated transfection, lipofection, or electroporation. 

[0355] Engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of 
the invention. Following transformation of a suitable host strain and growth of the host 
strain to an appropriate cell density, the selected promoter may be induced by appropriate 
means (e.g., temperature shift or chemical induction) and the cells may be cultured for an 
additional period to allow them to produce the desired polypeptide or fragment thereof. 

[0356] Cells can be harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract is retained for further purification. Microbial cells 
employed for expression of proteins can be disrupted by any convenient method, including 
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Such 
methods are well known to those skilled in the art. The expressed polypeptide or fragment 
can be recovered and purified from recombinant cell cultures by methods including 
ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange 
chromatography, phosphocellulose chromatography, hydrophobic interaction 
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin 
chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the polypeptide. If desired, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

[0357] Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 
lines of monkey kidney fibroblasts and other cell lines capable of expressing proteins from 
a compatible vector, such as the C127, 3T3, CHO, HeLa and BHK cell lines. 

[0358] The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Depending upon the host employed 
in a recombinant production procedure, the polypeptides produced by host cells containing 
the vector may be glycosylated or may be non-glycosylated. Polypeptides of the invention 
may or may not also include an initial methionine amino acid residue. 

[0359] Cell-free translation systems can also be employed to produce a polypeptide of 
the invention. Cell-free translation systems can use mRNAs transcribed from a DNA 
construct comprising a promoter operably linked to a nucleic acid encoding the 
polypeptide or fragment thereof. In some aspects, the DNA construct may be linearized 
prior to conducting an in vitro transcription reaction. The transcribed mRNA is then 
incubated with an appropriate cell-free translation extract, such as a rabbit reticulocyte 
extract, to produce the desired polypeptide or fragment thereof. 

[0360] The expression vectors can contain one or more selectable marker genes to 
provide a phenotypic trait for selection of transformed host cells such as dihydrofolate 
reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or 
ampicillin resistance in E. coli. 

[0361] For transient expression in mammalian cells, cDNA encoding a polypeptide of 
interest may be incorporated into a mammalian expression vector, e.g., pcDNAI, which is 
available commercially from Invitrogen Corporation (San Diego, Calif., U.S.A.; catalogue 
number V490-20). This is a multifunctional 4.2 kb plasmid vector designed for cDNA 
expression in eukaryotic systems and cDNA analysis in prokaryotes. Incorporated on the 
vector are the CMV promoter and enhancer, splice segment and polyadenylation signal, an 
SV40 and Polyoma virus origin of replication, an M13 origin to rescue single strand 
DNA for sequencing and mutagenesis, Sp6 and T7 RNA promoters for the production of 
sense and anti-sense RNA transcripts, and a ColE1-like high copy plasmid origin. A 
polylinker is located appropriately downstream of the CMV promoter (and 3' of the T7 
promoter). 

[0362] The cDNA insert may be first released from the above phagemid and incorporated at 
appropriate restriction sites in the pcDNAI polylinker. Sequencing across the junctions 
may be performed to confirm proper insert orientation in pcDNAI. The resulting plasmid 
may then be introduced for transient expression into a selected mammalian cell host, for 
example, the monkey-derived, fibroblast-like cells of the COS-1 lineage (available from 
the American Type Culture Collection, Rockville, Md. as ATCC CRL 1650). 

[0363] For transient expression of the protein-encoding DNA, for example, COS-1 cells 
may be transfected with approximately 8 µg DNA per 10^6 COS cells, by DEAE-mediated 
DNA transfection and treated with chloroquine according to the procedures described by 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 1989, Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, N.Y., pp. 16.30-16.37. An exemplary method is as 
follows. Briefly, COS-1 cells are plated at a density of 5 x 10^6 cells/dish and then grown 
for 24 hours in FBS-supplemented DMEM/F12 medium. Medium is then removed and 
cells are washed in PBS and then in medium. A transfection solution containing DEAE 
dextran (0.4 mg/ml), 100 µM chloroquine, 10% NuSerum, and DNA (0.4 mg/ml) in 
DMEM/F12 medium is then applied to the cells in a 10 ml volume. After incubation for 3 
hours at 37 °C, cells are washed in PBS and medium as just described and then shocked 
for 1 minute with 10% DMSO in DMEM/F12 medium. Cells are allowed to grow for 2-3 
days in 10% FBS-supplemented medium, and at the end of incubation dishes are placed on 
ice, washed with ice cold PBS and then removed by scraping. Cells are then harvested by 
centrifugation at 1000 rpm for 10 minutes and the cellular pellet is frozen in liquid 
nitrogen, for subsequent use in protein expression. Northern blot analysis of a thawed 
aliquot of frozen cells may be used to confirm expression of receptor-encoding cDNA in 
cells under storage. 

[0364] In a like manner, stably transfected cell lines can also be prepared, for example, 
using two different cell types as host: CHO K1 and CHO Pro5. To construct these cell 
lines, cDNA coding for the relevant protein may be incorporated into the mammalian 
expression vector pRC/CMV (Invitrogen), which enables stable expression. Insertion at 
this site places the cDNA under the expression control of the cytomegalovirus promoter 
and upstream of the polyadenylation site and terminator of the bovine growth hormone 
gene, and into a vector background comprising the neomycin resistance gene (driven by 
the SV40 early promoter) as selectable marker. 

[0365] An exemplary protocol to introduce plasmids constructed as described above is 
as follows. The host CHO cells are first seeded at a density of 5 x 10^5 in 10% 
FBS-supplemented MEM medium. After growth for 24 hours, fresh medium is added to the 
plates and three hours later, the cells are transfected using the calcium phosphate-DNA 
co-precipitation procedure (Sambrook et al., supra). Briefly, 3 µg of DNA is mixed and 
incubated with buffered calcium solution for 10 minutes at room temperature. An equal 
volume of buffered phosphate solution is added and the suspension is incubated for 15 
minutes at room temperature. Next, the incubated suspension is applied to the cells for 4 
hours and removed, and the cells are shocked with medium containing 15% glycerol. Three 
minutes later, cells are washed with medium and incubated for 24 hours at normal growth 
conditions. Cells resistant to neomycin are selected in 10% FBS-supplemented 
alpha-MEM medium containing G418 (1 mg/ml). Individual colonies of G418-resistant cells are 
isolated about 2-3 weeks later, clonally selected and then propagated for assay purposes. 



EXAMPLES 

A number of examples involved in the present invention are described below. In most 
cases, alternative techniques could also be used. For example, techniques, methods, and 
other information described in U.S. Patent 5,837,815; U.S. Patent 5,837,524; U.S. Patent 
Publication 2002/0048782; PCT/US98/02797, WO 98/35056; and McShan et al., Internat. 
J. Oncology 21:197-205 (2002) can be used in the present invention. Such techniques and 
information include, without limitation, cloning, culturing, purification, assaying, 
screening, use of modulators, sequence information, and information concerning 
biological role of PYK2. Each of these references is incorporated by reference herein in 
its entirety, including drawings. 

EXAMPLE 1: Cloning of PYK2 Kinase Domain 

[0366] The kinase domain of PYK2 (amino acids 420-691) was amplified by polymerase 
chain reaction (PCR) using the specific primers 
5'-TCCACAGCATATGATTGCCCGTGAAGATGTGGT-3' (SEQ ID NO: 5) and 
5'-CTCTCGTCGACCTACATGGCAATGTCCTTCTCCA-3' (SEQ ID NO: 6). The 
resulting PCR fragment was digested with NdeI and SalI and was ligated into a modified 
pET15b vector (Novagen) with a cleavable N-terminal hexa-histidine tag (designated 
pET15S). The PYK2 coding sequence has been deposited with GenBank under accession 
number U33284. A desired PYK2 sequence can be obtained using PCR with a brain (e.g., 
human brain) cDNA library, such as by obtaining the kinase domain using the above primers in 
PCR. The multi-cloning site of the pET15S vector is shown in the following sequence 
(SEQ ID NO: 7), including the sequence encoding the N-terminal hexa-histidine tag: 

T7 promoter 
AGATCT CGATCCCGCGAAAT TAATACGACTCACTATA GGGGAATTGTGAGCGGATAACAATTCCCC 

RBS 
TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACC 

NdeI 
ATGGGCAGCAGCCATCATCATCATCATCACAGCAGCGGCCTGGTGCCGCGCGGCAGCCATATGGGATCCGG 
MGSSHHHHHHSSGLVPRGSHM 

StuI, SalI 
AATTCAAAGGCCTACGTCGACTAGAGCCTGCAGTCTCGACCATCATCATCATCATCATTAATAAAAG 

SpeI, BamHI 
GGCCGTTACTAGTGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGG 

IVEX-3 primer, Bpu1102I, T7 terminator, 3'-pET primer 
[...] TGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTG 


[0367] The pET15S vector is derived from the pET15b vector (Novagen) for bacterial 
expression to produce proteins with an N-terminal His6 tag. The vector was modified by 
replacing the NdeI-BamHI fragment with another fragment to create a SalI site and stop codon (TAG). 
Vector size is 5814 bp. Inserts can be cloned using the NdeI-SalI sites. 
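
The cloning design described in Example 1 can be checked with a short script. The following is a minimal sketch, not part of the disclosed method: it uses only the primer and tag sequences quoted above, and the helper names (`reverse_complement`, `translate`) are illustrative. It confirms that the forward primer carries an NdeI site (CATATG) whose ATG is in frame with the start of the insert, that the reverse primer places a TAG stop codon immediately before the SalI site (GTCGAC) on the coding strand, and that the tag region translates to the peptide shown under the sequence, including the LVPRGS thrombin recognition motif.

```python
# Sketch: sanity-check the Example 1 primers and the His-tag reading frame.
# Sequences are copied from the text; helper names are illustrative only.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA sequence in frame 0, ignoring a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


FORWARD = "TCCACAGCATATGATTGCCCGTGAAGATGTGGT"    # SEQ ID NO: 5
REVERSE = "CTCTCGTCGACCTACATGGCAATGTCCTTCTCCA"   # SEQ ID NO: 6
TAG_REGION = ("ATGGGCAGCAGCCATCATCATCATCATCACAGCAGC"
              "GGCCTGGTGCCGCGCGGCAGCCATATG")      # from the NdeI line of SEQ ID NO: 7

NDEI, SALI = "CATATG", "GTCGAC"

ndei_pos = FORWARD.find(NDEI)
sali_pos = REVERSE.find(SALI)

# Reading from the ATG inside the NdeI site gives the start of the insert.
insert_start = translate(FORWARD[ndei_pos + 3:])

# The reverse primer anneals to the coding strand, so view its reverse complement.
coding_3prime = reverse_complement(REVERSE)

tag_peptide = translate(TAG_REGION)

print("NdeI site at", ndei_pos, "-> insert starts", insert_start)
print("Coding-strand 3' end:", coding_3prime)
print("Tag peptide:", tag_peptide)
```

Because the stop codon is supplied by the reverse primer, the expressed protein ends at the last PYK2 codon rather than reading into the vector.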

[0368] 

[0369] The amino acid and nucleic acid sequences for the PYK2 kinase domain utilized 
are provided in Table 4 (SEQ ID NO: 1 and 3 respectively). 

EXAMPLE 2: Expression and Purification of PYK2 Kinase Domain 

[0370] For protein expression, the Pyk2 kinase domain construct was transformed into E. coli strain 
BL21 (DE3) pLysS and transformants were selected on LB plates containing kanamycin. 
Single colonies were grown overnight at 37°C in 200 ml TB (terrific broth) media. 16 x 1 L 
of fresh TB media in 2.8 L flasks were inoculated with 10 ml of overnight culture and 
grown with constant shaking at 37°C. Once cultures reached an absorbance of 1.0 at 
600 nm, 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) was added and cultures were 
allowed to grow for a further 12 hrs at 22°C with constant shaking. Cells were harvested by 
centrifugation at 7000 x g and pellets were frozen in liquid nitrogen and stored at -80°C 
until ready for lysis. 

[0371] The cell pellet was suspended in lysis buffer containing 0.1 M potassium 
phosphate buffer pH 8.0, 200 mM NaCl, 10% glycerol, 2 mM PMSF and EDTA-free 
protease inhibitor cocktail tablets (Roche). Cells were lysed using a microfluidizer 
processor (Microfluidics Corporation) and insoluble cellular debris was removed by 
centrifugation at 30,000 x g. The cleared supernatant was added to Talon resin (Clontech) 
and incubated for 4 hrs at 4°C with constant rocking. The suspension was loaded onto a 
column and washed with 20 column volumes of lysis buffer plus 10 mM imidazole. 
Protein was eluted stepwise with addition of lysis buffer plus 200 mM imidazole pH 7.5 
and 1 ml fractions were collected. Fractions containing PYK2 were pooled, concentrated and 
loaded onto a HiLoad 26/60 Superdex 200 sizing column (Pharmacia) 
pre-equilibrated with 20 mM Tris pH 7.5, 150 mM NaCl. 

[0372] Peak fractions were collected and assayed by SDS-PAGE. Fractions containing 
PYK2 were pooled and diluted in Tris buffer pH 7.5 until 30 mM NaCl was reached. 
Diluted protein was further subjected to anion exchange chromatography using a Source 
15Q (Pharmacia) sepharose column equilibrated with 20 mM Tris pH 7.5. Elution was 
performed using a linear gradient of sodium chloride (0-500 mM). Eluted protein was 
treated with 2 U thrombin per mg protein to remove the N-terminal histidine tag. Following 
cleavage, Pyk2 was re-applied to a Source 15Q (Pharmacia) sepharose column equilibrated 
with 20 mM Tris pH 7.5, and eluted using a linear sodium chloride gradient. Purified 
protein was concentrated to 100 mg/ml and stored at -80°C until ready for crystallization 
screening. 
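
The dilution and thrombin steps above reduce to simple arithmetic. The sketch below is illustrative only: it assumes the Tris diluent contains no NaCl, and the 10 ml pool volume and 25 mg protein amount are hypothetical inputs, not values from the text. It shows that bringing the gel-filtration pool from 150 mM to 30 mM NaCl is a five-fold dilution, and applies the stated 2 U thrombin per mg protein.

```python
# Sketch: buffer arithmetic for the anion-exchange and tag-cleavage steps.
# Assumes the diluent is salt-free; sample volume and protein mass are hypothetical.


def diluent_volume_ml(sample_ml, start_mM, target_mM):
    """Volume of salt-free buffer to add so NaCl drops from start_mM to target_mM."""
    if not 0 < target_mM <= start_mM:
        raise ValueError("target must be positive and no greater than the start")
    return sample_ml * (start_mM / target_mM - 1.0)


def thrombin_units(protein_mg, units_per_mg=2.0):
    """Thrombin needed at the stated ratio of 2 U per mg protein."""
    return protein_mg * units_per_mg


pool_ml = 10.0       # hypothetical pooled volume
protein_mg = 25.0    # hypothetical protein amount

added_ml = diluent_volume_ml(pool_ml, start_mM=150.0, target_mM=30.0)
print(f"Add {added_ml:.1f} ml buffer to {pool_ml:.1f} ml pool "
      f"({(pool_ml + added_ml) / pool_ml:.0f}-fold dilution)")
print(f"Thrombin required: {thrombin_units(protein_mg):.0f} U")
```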

Example 3: Crystallization of PYK2 Kinase Domain 

[0373] Crystallization conditions were initially identified using the Hampton Research 
(Riverside, CA) screening kit. Optimized crystals were grown by vapor diffusion in 
sitting drop plates with equal volumes of protein solution (10 mg/ml protein in 20 mM 
Tris-HCl pH 8.0, 150 mM NaCl, 14 mM BME, 1 mM DTT) and reservoir solution 
(8% polyethylene glycol (PEG) 8000, 0.2 M sodium acetate, 0.1 M cacodylate 
pH 6.5, 20% glycerol). Blades of crystals grew overnight at 4°C. Microseeding was used 
to produce larger, single crystals, the largest crystal being around 0.3 mm x 0.05 mm x 
0.02 mm. 

Example 4: Diffraction Analysis of PYK2 

[0374] Synchrotron X-ray data for Pyk2 were collected at beamline 8.3.1 of the 
Advanced Light Source (ALS, Lawrence Berkeley National Laboratory, Berkeley) on a 
Quantum 210 charge-coupled device detector (λ = 1.10 Å). The mother liquor from the 
reservoir was used as cryo-protectant for the crystal. Detector distance was 110 mm and 
exposure time was 10 s per frame. 200 frames were collected with 0.5° oscillation over a 
wedge of 100°. The quality and resolution limits of the diffraction pattern were 
considerably improved by annealing the crystal: the crystal was briefly allowed to warm 
up for 10 seconds by shutting off the nitrogen cryo stream and was refrozen by resuming 
cooling with the cryo stream. Crystals of PYK2 diffracted to a resolution limit of 1.45 Å 
with cell dimensions of a = 37 Å, b = 47 Å, c = 81 Å, α = 90°, β = 92°, γ = 90°. The data 
were processed using Mosflm and scaled and reduced with Scala in CCP4 in space 
group P2. Data processing was driven by the ELVES automation scripts (J. M. 
Holton, unpublished data). An inspection of the 0k0 zone indicated that all odd (2n+1) 
reflections were very weak compared with the even reflections, suggesting the space group 
to be P2₁. 
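
The systematic-absence argument above (odd 0k0 reflections weak, hence a 2₁ screw axis along b) can be expressed as a simple intensity comparison. The following is a hedged sketch only: the function name and the reflection list are hypothetical, the intensities are made-up numbers chosen for illustration, and this is not how Mosflm, Scala or ELVES perform the check.

```python
# Sketch: compare odd-k and even-k intensities along the 0k0 axis.
# A ratio far below 1 is consistent with a 2(1) screw axis along b (P2 -> P2(1)).
# The reflection data below are invented for illustration.


def screw_axis_ratio(reflections):
    """Return mean I(0k0, k odd) / mean I(0k0, k even).

    reflections: iterable of (h, k, l, intensity) tuples.
    """
    odd, even = [], []
    for h, k, l, intensity in reflections:
        if h == 0 and l == 0 and k != 0:
            (odd if k % 2 else even).append(intensity)
    if not odd or not even:
        raise ValueError("need both odd and even 0k0 reflections")
    return (sum(odd) / len(odd)) / (sum(even) / len(even))


example = [
    (0, 1, 0, 2.0),
    (0, 2, 0, 150.0),
    (0, 3, 0, 1.0),
    (0, 4, 0, 250.0),
    (1, 1, 0, 90.0),   # not on the 0k0 axis, ignored
]

ratio = screw_axis_ratio(example)
print(f"odd/even intensity ratio on 0k0: {ratio:.4f}")
```

In practice the comparison would use I/σ(I) rather than raw intensity and many more axial reflections; the threshold for calling an absence is a judgment call.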

PYK2 Structure Determination and Refinement 

[0375] The initial phases for the dataset were obtained by molecular replacement. A 
homology model of the protein Pyk2 was generated using the LCK kinase structure 
(PDB ID: 1qpc) as a template. This model was trimmed by excising all loops before being 
used in the molecular replacement program EPMR, which resulted in a solution with 
CC=0.372. The molecular replacement solution phases were improved by the program 
Arp-Warp. The resultant model was further improved by manual model building and 
extension in O and refinement with CNX and Refmac5 in CCP4. The cycle of 
model building and refinement continued until the model was complete and refinement 
converged to an R/Rfree of 20.83/26.94 %. The geometric analysis of the model was 
performed by PROCHECK, which indicated the structure to have excellent geometry. 

[0376] Data collection and refinement statistics for PYK2 kinase domain crystal, and for 
PYK2 kinase domain/binding compound cocrystal are summarized in the following table: 



Data Collection and Refinement Statistics 

                                  Pyk2 (APO)                    Pyk2+AMPPNP 

Crystal Parameters 
Space group                       P2₁                           P2₁ 
Unit cell (Å)                     a=37.17, b=46.97,             a=37.32, b=46.98, 
                                  c=80.36, β=92.63              c=81.11, β=92.83 
Number of molecules/AU            1                             1 
V_M (Å³/Dalton)                   2.4                           2.4 
Solvent content (%)               48                            48 

Data Collection and Processing 
Resolution (Å)                    1.45                          1.80 
Wavelength (Å)                    1.1                           1.1 
Unique reflections                47843                         26149 
Redundancy (last shell*)          2.0 (1.8)                     4.0 (2.9) 
Completeness (last shell) (%)     97.5 (88.9)                   99.8 (97.8) 
I/σ (last shell)                  10.9 (1.3)                    12.0 (2.3) 
Rsym (last shell)                 0.043 (0.487)                 0.063 (0.459) 
*Last shell (Å)                   1.49-1.45                     1.85-1.80 

Refinement 
Rwork/Rfree (%)                   16.93/20.68                   18.62/22.81 
Number of atoms                   2583                          2507 
Rmsd from ideal geometry          0.012 (bond distance),        0.010 (bond distance), 
                                  1.434 (bond angle)            1.372 (bond angle) 
SigmaA coordinate error           0.16 Å (for 5.0-1.45 Å)       0.14 Å (for 5.0-1.80 Å) 
Average B-factors (Å²)            19.3                          20.5 
  Protein atoms                   16.4                          19.0 
  Waters                          37.6                          34.3 
  Ligand                                                        44.41 

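
The V_M and solvent content rows of the table can be roughly reproduced from the unit cell. The sketch below is an illustration under stated assumptions: the molecular weight of 30,000 Da is a hypothetical round figure for a 273-residue protein (the text does not give the value used), Z = 2 follows from space group P2₁ with one molecule per asymmetric unit, and the solvent estimate uses the usual Matthews approximation 1 - 1.23/V_M.

```python
# Sketch: Matthews coefficient and solvent content for the apo crystal.
# Cell constants are from the table; MW_DA is an assumed value, not from the text.
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume (cubic angstroms) for a monoclinic cell: a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, molecules_per_cell, mw_da):
    """V_M in cubic angstroms per Dalton."""
    return cell_volume / (molecules_per_cell * mw_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (Matthews approximation)."""
    return 1.0 - 1.23 / v_m


MW_DA = 30000.0   # assumption: roughly 273 residues x ~110 Da
Z = 2             # P2(1): two asymmetric units per cell, one molecule each

volume = monoclinic_cell_volume(37.17, 46.97, 80.36, 92.63)
v_m = matthews_coefficient(volume, Z, MW_DA)

print(f"Cell volume: {volume:.0f} A^3")
print(f"V_M: {v_m:.2f} A^3/Da, solvent: {100 * solvent_fraction(v_m):.0f}%")
```

With this assumed mass the result falls close to the tabulated 2.4 Å³/Da and 48%; the exact figures depend on the molecular weight actually used.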



[0377] The model of Pyk2 contains 273 amino acids (spanning the PYK2 sequence 
420-691 with one residue from the cloning vector) and 180 water molecules. The Pyk2 
structure adopts the standard kinase fold consisting of an N-terminal β-sheet domain and a 
C-terminal α-helical domain linked by a 5 residue linker. The linker segment contains the 
canonical H-bond acceptor/donor residues E503 and Y505 that would normally interact 
with the adenosine ring of ATP. In the apo structure these residues make H-bonds with 
water molecules. 

[0378] A ribbon diagram of the PYK2 active site is shown in Figure 1. Atomic 
coordinates for the apo protein are provided in Table 1, while atomic coordinates for a 
PYK2 co-crystallized with a binding compound (AMPPNP) are provided in Table 2. 

Active Loop Conformation 

[0379] In many protein kinases, the activation loop, or A-loop, plays an important role in 
regulating the kinase activity. In active kinases, the A-loops adopt a highly similar 
conformation characterized by the formation of three small β-sheet moieties: two with the 
main body of the protein (the beginning of the catalytic or C-loop and the αEF/αF loop, 
respectively), and one with the substrate peptide. In contrast, the inactive conformation of 
the A-loop differs markedly from protein to protein, albeit having the similar effect of 
blocking ATP binding, substrate binding, or both. In comparison with the active insulin 
receptor (INSR) and IGFR1 kinase domain structures, the A-loop in the solved Pyk2 
structure is clearly in an inactive conformation. The loop is stabilized by a unique set of 
intra- and inter-loop interactions that differentiate it from all known A-loop structures. 

[0380] The A-loop in our Pyk2 structure starts to deviate from the standard active 
conformation at the DFG motif (for comparison, we modeled the active A-loop 
conformation of Pyk2 based on the IGFR1 structure). The first two residues of the DFG 
motif (D567 and F568) have orientations similar to those of their counterparts in the active A-loop 
form, with D567 interacting with K457 (β3) and F568 locked in a hydrophobic pocket 
sandwiched by two residues (I477 and M478) from αC. However, the third residue in the 
motif, G569, adopts a completely different conformation, resulting in the formation of a 
hydrogen bond between G569:NH and H547:CO. This hydrogen bond forces the A-loop onto a 
different path that precludes it from forming a β-sheet with the C-loop. A similar hydrogen 
bond has also been observed in two other tyrosine kinases: HCK (1qcf) and SRC (1fmk). 

[0381] There are multiple interactions that help to stabilize the A-loop in its observed 
conformation. Most of them involve a unique sequence moiety of Pyk2. Among the 
tyrosine kinases of known structure, Pyk2 contains a unique ED repeat (E575-D578) in the 
A-loop. In the Pyk2 structure, E575 is exposed to solvent, whereas D576 initiates a tight 
β-turn. Besides providing the canonical β-turn backbone hydrogen bond between D576:CO 
and Y579:NH, the side chain of D576 also interacts with D578:NH. The β-turn region of the A-loop is 
held to the αEF/αF loop by two side-chain-backbone hydrogen bonds: one between 
E577:CO and R600:Nε and the other between K581:NZ and N598:CO. The side chain of E577 interacts 
with the end of the activation loop via two hydrogen bonds, one with T585 (OG) and the 
other with R586 (NH). The most interesting feature of the Pyk2 A-loop is the salt bridge 
formed between D588 and R547 from the C-loop (the distances between the two OD and two 
NH atoms are 2.9 Å). Neither of the two tyrosines Y579 and Y580 is phosphorylated in our 
structure. Y579 is exposed to solvent, whereas Y580 binds to the hydrophobic portions of the 
E575 and E577 side chains. 

[0382] Because FAK does not have the second ED, the conformation of the A-loop in an 
inactive FAK is expected to be different. 

Implications for substrate binding and autophosphorylation 

[0383] An important event in the enzymatic activation of FAK/Pyk2 is the 
autophosphorylation of a tyrosine residue before the catalytic domain (Y402). The 
phosphorylated Y402 provides the binding site for Src and other related kinases and 
facilitates Src-dependent phosphorylation of other tyrosine residues on Pyk2 including 
Y579 and Y580. It is not clear how autophosphorylation could occur before Y579 and 
Y580 are phosphorylated. 

[0384] To test whether Y402 can reach the substrate binding site, we modeled the 7 
residue peptide D400-IYAEIPD-407 containing Y402 into the substrate binding site based on 
the cocrystal structure of the IGFR1 kinase domain with its substrate peptide. In our protein 
construct, the Pyk2 insert starts at I420. There are four residues (GSHM) N-terminal to 
I420 left by the His-tag used; of those, only M419 is visible. We then modeled the 11 
residues that link D407 to M419. The model shows that, in order to reach the substrate 
binding site, the N-terminal region has to traverse along the back of αC. The link would 
also fix the A-loop in the active conformation. This may provide the mechanism by which the 
protein autophosphorylates Y402. Once Y402 is phosphorylated, the N-terminus is 
then released and subject to SH2 binding. The A-loop also becomes flexible and 
accessible to Src. 

[0385] Because the residues surrounding the P+1 and P+3 binding pocket are mostly 
hydrophobic in tyrosine kinases, substrate P+1 and P+3 sites are mostly hydrophobic 
residues. The residue that might interact with P+2 varies. Acidic and other polar side 
chains might be preferred because of the nearby residue R586. The P-1 site is an acidic 
residue in INSR and IGFR1. The residue for interacting with P-1 is Arg; this residue is 
changed to Gly in Pyk2, leaving the space largely hydrophobic. The autophosphorylation 
site sequence in Pyk2, IYAEIPD, and the sequences of several other known Pyk2 
phosphorylation sites fit well the substrate selectivity profile of Pyk2. 

Example 5: PYK2 Binding Assays 

[0386] Binding assays can be performed in a variety of ways, including ways known in 
the art. For example, competitive binding to PYK2 can be measured on 
Nickel-FlashPlates, using His-tagged PYK2 (~100 ng) and ATPγ[35S] (~10 nCi). As 
compound is added, the signal decreases, since less ATPγ[35S] is bound to PYK2, which is 
proximal to the scintillant in the FlashPlate. The binding assay can be performed by the 
addition of compound (10 µl; 20 mM) to PYK2 protein or kinase domain (90 µl), 
followed by the addition of ATPγ[35S] and incubation for 1 hr at 37 °C. The radioactivity 
is measured through scintillation counting in a TriLux (Perkin-Elmer). 
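
The displacement readout described above can be turned into a percent-inhibition curve and a rough IC50 estimate. The following Python sketch is one way to do that under stated assumptions: the function names are hypothetical, the counts are invented for illustration and are not data from this specification, and the IC50 is estimated by simple log-linear interpolation between the two concentrations that bracket 50% inhibition rather than by a full curve fit.

```python
import math


def percent_inhibition(signal, max_signal, background):
    """Percent loss of bound-radioligand signal relative to the no-compound control."""
    return 100.0 * (max_signal - signal) / (max_signal - background)


def interpolate_ic50(concs, inhibitions):
    """Estimate the concentration giving 50% inhibition by log-linear interpolation.

    concs must be in ascending order. Returns None if 50% is never crossed.
    """
    points = list(zip(concs, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical scintillation counts (cpm), for illustration only.
max_signal, background = 10000.0, 1000.0
concs_um = [0.1, 1.0, 10.0, 100.0]
counts = [9100.0, 7300.0, 3700.0, 1900.0]

inhibitions = [percent_inhibition(c, max_signal, background) for c in counts]
ic50_um = interpolate_ic50(concs_um, inhibitions)
```

A nonlinear fit to a dose-response model would normally be preferred when enough concentrations are measured; the interpolation here is only meant to show the arithmetic.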

[0387] Alternatively, any method which can measure binding of a ligand to the ATP- 
binding site can be used. For example, a fluorescent ligand can be used. When bound to 
PYK2, the emitted fluorescence is polarized. Once displaced by inhibitor binding, the 
polarization decreases. 

[0388] Determination of IC50 for compounds by competitive binding assays. (Note that 
Ki is the dissociation constant for inhibitor binding; KD is the dissociation constant for 
substrate binding.) For this system, the IC50, inhibitor binding constant and substrate 
binding constant can be interrelated according to the following formula: 

[0389] When using radiolabeled substrate, 

$$K_i = \frac{IC_{50}}{1 + [L^*]/K_D}$$

[0390] the IC50 ≈ Ki when there is a small amount of labeled substrate. 
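
The relation between IC50, Ki, [L*] and KD given above can be evaluated directly. The Python sketch below is a minimal illustration; the function name and the example numbers are hypothetical, and all concentrations are assumed to be in the same units.

```python
def ki_from_ic50(ic50, labeled_conc, kd):
    """Ki = IC50 / (1 + [L*]/KD), with all quantities in the same concentration units."""
    if kd <= 0:
        raise ValueError("KD must be positive")
    return ic50 / (1.0 + labeled_conc / kd)


# Hypothetical values for illustration only.
ki_equal = ki_from_ic50(ic50=2.0, labeled_conc=1.0, kd=1.0)     # [L*] = KD halves the IC50
ki_trace = ki_from_ic50(ic50=5.0, labeled_conc=0.001, kd=10.0)  # trace label: Ki close to IC50
```

The second call illustrates the limiting case stated in the text: when the labeled substrate is present at a concentration far below KD, the correction factor approaches 1 and Ki is approximately equal to IC50.
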
Example 6: PYK2 Activity Assay 

[0391] As an exemplary kinase assay, the kinase activity of PYK2 was measured by 
AlphaScreen (Packard Bioscience). The kinase buffer (HMNB) contains 50 mM HEPES 
at pH 7.2, Mg and Mn at 5 mM each, 0.1% NP-40, and BSA at a final concentration of 
50 µg/ml. AlphaScreen is conducted as described by the manufacturer. In brief, the kinase 
reaction is performed in a 384-well plate in a 25 µl volume. The substrate is biotin-(E4Y)3 
at a final concentration of 1 nM. The final concentration of ATP is 10 µM. For compound 
testing the final DMSO concentration is 1%. The reaction is incubated at 31 °C for 1 hour. 

[0392] The Pyk2 kinase domain (residues 419 to 691) is an active kinase in AlphaScreen. 
At a concentration of 8 ng/well in a 384-well plate, PYK2 shows a Kd of 7.34 µM, which is 
in general agreement with most protein kinases (Table 5). Inhibition by ATP analogs was 
tested with Pyk2 at 8 ng/well and ATP at 10 µM. The data are shown in Table 5. The affinity 
of ATP-γ-S and ADP for Pyk2 is 14 µM. Adenosine and AMP-PCP have little effect 
on PYK2 at the concentrations tested. 
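
The final concentrations stated for the 25 µl kinase reaction imply some simple dilution arithmetic (C1·V1 = C2·V2). The Python sketch below shows one way to compute pipetting volumes; the stock concentrations used (1 mM ATP, 100 nM substrate) and the function name are hypothetical choices for illustration and do not come from this specification.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock (µl) giving final_conc in final_volume_ul.

    final_conc and stock_conc must be expressed in the same units.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ul / stock_conc


REACTION_VOLUME_UL = 25.0  # reaction volume stated in the text

# Hypothetical stocks: 1 mM (1000 µM) ATP and 100 nM biotinylated substrate.
atp_ul = stock_volume_ul(final_conc=10.0, stock_conc=1000.0, final_volume_ul=REACTION_VOLUME_UL)
substrate_ul = stock_volume_ul(final_conc=1.0, stock_conc=100.0, final_volume_ul=REACTION_VOLUME_UL)
# Compound dissolved in neat DMSO: 1% final DMSO corresponds to 1% of the volume.
dmso_ul = stock_volume_ul(final_conc=1.0, stock_conc=100.0, final_volume_ul=REACTION_VOLUME_UL)
```

Volumes this small are usually handled by preparing a master mix for many wells or an intermediate dilution; the calculation itself is unchanged.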



Example 9: Synthesis of the Compounds of Formula I 

Scheme-1 

[0393] The triazole derivatives, represented by Formula I, can be prepared as shown in 
Scheme-1. 

Step-1 Preparation of formula (3) 

[0394] The compound of formula (3) is prepared conventionally by reaction of a 
compound of formula (1), where R1 = alkyl, aryl, heteroaryl (e.g. m-toluic hydrazide), with 
an isothiocyanate of formula (2), in a basic solvent (e.g. pyridine), typically heated near 
65 °C for 2-6 hours. 

Step-2 Preparation of formula (5) 

[0395] The compound of formula (5) is prepared conventionally by reaction of a 
compound of formula (3) with an alkylating agent of formula (4) (e.g. methyl iodide), in an 
inert solvent (e.g. THF) at room temperature for 24-48 hours. 

Step-3 Preparation of Formula I 

[0396] The compound of Formula I is prepared by dissolving a compound of formula (5) 
in POCl3 and heating near 80 °C for 8-12 hours. When the reaction is substantially 
complete, the product of Formula I is isolated by conventional means (e.g. reverse phase 
HPLC). See Smith et al., J. Comb. Chem., 1999, 7, 368-370, and references therein. 

Example 10: Site-directed Mutagenesis of PYK2 kinase 

[0397] Mutagenesis of PYK2 kinase can be carried out according to the following 
procedure as described in Molecular Biology: Current Innovations and Future Trends. Eds. 
A.M. Griffin and H.G.Griffin. (1995) ISBN 1-898486-01-8, Horizon Scientific Press, PO 
Box 1, Wymondham, Norfolk, U.K., among others. 

[0398] In vitro site-directed mutagenesis is an invaluable technique for studying protein 
structure-function relationships, gene expression and vector modification. Several methods 
have appeared in the literature, but many of these methods require single-stranded DNA as 
the template. The reason for this, historically, has been the need for separating the 
complementary strands to prevent reannealing. Use of PCR in site-directed mutagenesis 
accomplishes strand separation by using a denaturing step to separate the complementing 
strands and allowing efficient polymerization of the PCR primers. PCR site-directed 
methods thus allow site-specific mutations to be incorporated in virtually any 
double-stranded plasmid, eliminating the need for M13-based vectors or single-stranded rescue. 

[0399] It is often desirable to reduce the number of cycles during PCR when performing 
PCR-based site-directed mutagenesis to prevent clonal expansion of any (undesired) 
second-site mutations. Limited cycling, which would result in reduced product yield, is 
offset by increasing the starting template concentration. A selection is used to reduce the 
number of parental molecules coming through the reaction. Also, in order to use a single 
PCR primer set, it is desirable to optimize the long PCR method. Further, because of the 
extendase activity of some thermostable polymerases, it is often necessary to incorporate 
an end-polishing step into the procedure prior to end-to-end ligation of the PCR-generated 
product containing the incorporated mutations in one or both PCR primers. 

[0400] The following protocol provides a facile method for site-directed mutagenesis 
and accomplishes the above desired features by the incorporation of the following steps: 
(i) increasing template concentration approximately 1000-fold over conventional PCR 
conditions; (ii) reducing the number of cycles from 25-30 to 5-10; (iii) adding the 
restriction endonuclease DpnI (recognition target sequence: 5'-Gm6ATC-3', where the A 
residue is methylated) to select against parental DNA (note: DNA isolated from almost all 
common strains of E. coli is Dam-methylated at the sequence 5'-GATC-3'); (iv) using Taq 
Extender in the PCR mix for increased reliability for PCR to 10 kb; (v) using Pfu DNA 
polymerase to polish the ends of the PCR product; and (vi) efficient intramolecular 
ligation in the presence of T4 DNA ligase. 
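
Because the DpnI selection in step (iii) depends on the parental template carrying Dam-methylated GATC sites, it can be useful to confirm that a template sequence contains such sites. The Python sketch below simply lists the positions of GATC in a sequence; the function name and the example sequence are hypothetical. GATC is its own reverse complement, so scanning one strand is sufficient, but the sketch treats the sequence as linear and would miss a site spanning the origin of a circular plasmid.

```python
def find_gatc_sites(sequence):
    """Return 0-based start positions of every GATC occurrence (overlaps allowed)."""
    seq = sequence.upper()
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    return sites


# Hypothetical fragment for illustration only.
example = "aaGATCttgatcGATC"
sites = find_gatc_sites(example)
```

Whether a given site is actually cut still depends on its methylation state, which sequence inspection alone cannot establish.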

[0401] Plasmid template DNA (approximately 0.5 pmole) is added to a PCR cocktail 
containing, in 25 µl of 1x mutagenesis buffer (20 mM Tris-HCl, pH 7.5; 8 mM MgCl2; 40 
µg/ml BSA): 12-20 pmole of each primer (one of which must contain a 5-prime 
phosphate), 250 µM each dNTP, 2.5 U Taq DNA polymerase, and 2.5 U of Taq Extender 
(Stratagene). 
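
Template amounts above are given in pmole, whereas plasmid preparations are usually quantified by mass. The Python sketch below converts between the two for double-stranded DNA, assuming the commonly used approximation of about 650 g/mol per base pair; that constant, the function name and the 5000 bp plasmid length are assumptions for illustration and are not taken from this specification.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one base pair of dsDNA


def dsdna_pmol_to_ng(pmol, length_bp):
    """Mass in ng of `pmol` picomoles of double-stranded DNA of the given length."""
    # pmol x (g/mol) gives picograms; divide by 1000 to obtain nanograms.
    return pmol * length_bp * AVG_BP_MASS_G_PER_MOL / 1000.0


# Hypothetical 5000 bp plasmid: 0.5 pmole corresponds to roughly 1.6 µg.
template_ng = dsdna_pmol_to_ng(0.5, 5000)
```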

[0402] The PCR cycling parameters are 1 cycle of: 4 min at 94 °C, 2 min at 50 °C and 2 
min at 72 °C; followed by 5-10 cycles of 1 min at 94 °C, 2 min at 54 °C and 1 min at 72 °C 
(step 1). 
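
The cycling parameters above can be written down as data, which makes it easy to check the total hold time for a chosen cycle number. The Python sketch below is only an illustration: the variable and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# (temperature in °C, hold time in minutes), as stated in the protocol text.
INITIAL_CYCLE = [(94, 4), (50, 2), (72, 2)]
REPEAT_CYCLE = [(94, 1), (54, 2), (72, 1)]


def total_hold_minutes(n_repeat_cycles):
    """Total programmed hold time for one initial cycle plus 5-10 repeat cycles."""
    if not 5 <= n_repeat_cycles <= 10:
        raise ValueError("the protocol specifies 5-10 repeat cycles")
    initial = sum(minutes for _, minutes in INITIAL_CYCLE)
    repeat = sum(minutes for _, minutes in REPEAT_CYCLE)
    return initial + n_repeat_cycles * repeat


shortest = total_hold_minutes(5)   # 8 + 5 * 4 minutes
longest = total_hold_minutes(10)   # 8 + 10 * 4 minutes
```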

[0403] The parental template DNA and the linear, mutagenesis-primer incorporating 
newly synthesized DNA are treated with DpnI (10 U) and Pfu DNA polymerase (2.5 U). 
This results in the DpnI digestion of the in vivo methylated parental template and hybrid 
DNA and the removal, by Pfu DNA polymerase, of the Taq DNA polymerase-extended 
base(s) on the linear PCR product. 

[0404] The reaction is incubated at 37 °C for 30 min and then transferred to 72 °C for an 
additional 30 min (step 2). 

[0405] Mutagenesis buffer (1x, 115 µl, containing 0.5 mM ATP) is added to the 
DpnI-digested, Pfu DNA polymerase-polished PCR products. 

[0406] The solution is mixed and 10 µl is removed to a new microfuge tube and T4 
DNA ligase (2-4 U) added. 

[0407] The ligation is incubated for greater than 60 min at 37 °C (step 3). 

[0408] The treated solution is transformed into competent E. coli (step 4). 

[0409] In addition to the PCR-based site-directed mutagenesis described above, other 
methods are available. Examples include those described in Kunkel (1985) Proc. Natl. 
Acad. Sci. 82:488-492; Eckstein et al. (1985) Nucl. Acids Res. 13:8764-8785; and using 
the GeneEditor™ Site-Directed Mutagenesis System from Promega. 

[0410] All patents and other references cited in the specification are indicative of the 
level of skill of those skilled in the art to which the invention pertains, and are 
incorporated by reference in their entireties, including any tables and figures, to the same 
extent as if each reference had been incorporated by reference in its entirety individually. 

[0411] One skilled in the art would readily appreciate that the present invention is well 
adapted to obtain the ends and advantages mentioned, as well as those inherent therein. 
The methods, variances, and compositions described herein as presently representative of 
preferred embodiments are exemplary and are not intended as limitations on the scope of 
the invention. Changes therein and other uses will occur to those skilled in the art, which 
are encompassed within the spirit of the invention and are defined by the scope of the claims. 

[0412] It will be readily apparent to one skilled in the art that varying substitutions and 
modifications may be made to the invention disclosed herein without departing from the 
scope and spirit of the invention. For example, variations can be made to crystallization or 
co-crystallization conditions for PYK2 proteins and/or various kinase domain sequences 
can be used. Thus, such additional embodiments are within the scope of the present 
invention and the following claims. 

[0413] The invention illustratively described herein suitably may be practiced in the 
absence of any element or elements, limitation or limitations which is not specifically 
disclosed herein. Thus, for example, in each instance herein any of the terms 
"comprising", "consisting essentially of" and "consisting of" may be replaced with either 
of the other two terms. The terms and expressions which have been employed are used as 
terms of description and not of limitation, and there is no intention in the use of such 
terms and expressions of excluding any equivalents of the features shown and described or 
portions thereof, but it is recognized that various modifications are possible within the 
scope of the invention claimed. Thus, it should be understood that although the present 
invention has been specifically disclosed by preferred embodiments and optional features, 
modification and variation of the concepts herein disclosed may be resorted to by those 
skilled in the art, and that such modifications and variations are considered to be within 
the scope of this invention as defined by the appended claims. 

Atty. Dkt. No.: 039363-1202 

[0414] In addition, where features or aspects of the invention are described in terms of 
Markush groups or other grouping of alternatives, those skilled in the art will recognize 
that the invention is also thereby described in terms of any individual member or subgroup 
of members of the Markush group or other group. 

[0415] Also, unless indicated to the contrary, where various numerical values are 
provided for embodiments, additional embodiments are described by taking any 2 
different values as the endpoints of a range. Such ranges are also within the scope of the 
described invention. 

[0416] Thus, additional embodiments are within the scope of the invention and within 
the following claims. 
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REFINEMENT. 
  PROGRAM     : REFMAC 5.1.25
  AUTHORS     : MURSHUDOV, VAGIN, DODSON

  REFINEMENT TARGET : MAXIMUM LIKELIHOOD

DATA USED IN REFINEMENT.
  RESOLUTION RANGE HIGH (ANGSTROMS) : 1.45
  RESOLUTION RANGE LOW  (ANGSTROMS) : 79.06
  DATA CUTOFF            (SIGMA(F)) : NONE
  COMPLETENESS FOR RANGE        (%) : 97.02
  NUMBER OF REFLECTIONS             : 45396

FIT TO DATA USED IN REFINEMENT.
  CROSS-VALIDATION METHOD          : THROUGHOUT
  FREE R VALUE TEST SET SELECTION  : RANDOM
  R VALUE     (WORKING + TEST SET) : 0.17122
  R VALUE            (WORKING SET) : 0.16934
  FREE R VALUE                     : 0.20676
  FREE R VALUE TEST SET SIZE   (%) : 5.0
  FREE R VALUE TEST SET COUNT      : 2407

FIT IN THE HIGHEST RESOLUTION BIN.
  TOTAL NUMBER OF BINS USED           : 20
  BIN RESOLUTION RANGE HIGH           : 1.450
  BIN RESOLUTION RANGE LOW            : 1.488
  REFLECTION IN BIN     (WORKING SET) : 3077
  BIN R VALUE           (WORKING SET) : 0.283
  BIN FREE R VALUE SET COUNT          : 151
  BIN FREE R VALUE                    : 0.287

NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
  ALL ATOMS                : 2583

B VALUES.
  FROM WILSON PLOT           (A**2) : NULL
  MEAN B VALUE      (OVERALL, A**2) : 15.129
  OVERALL ANISOTROPIC B VALUE.
   B11 (A**2) : [?].45
   B22 (A**2) : [?].51
   B33 (A**2) : [?].07
   B12 (A**2) : 0.00
   B13 (A**2) : [?].23
   B23 (A**2) : 0.00

ESTIMATED OVERALL COORDINATE ERROR.
  ESU BASED ON R VALUE                            (A): 0.083
  ESU BASED ON FREE R VALUE                       (A): 0.073
  ESU BASED ON MAXIMUM LIKELIHOOD                 (A): 0.046
  ESU FOR B VALUES BASED ON MAXIMUM LIKELIHOOD (A**2): 1.218

CORRELATION COEFFICIENTS.
  CORRELATION COEFFICIENT FO-FC      : 0.966
  CORRELATION COEFFICIENT FO-FC FREE : 0.949

RMS DEVIATIONS FROM IDEAL VALUES          COUNT    RMS    WEIGHT
  BOND LENGTHS REFINED ATOMS        (A):  2278 ; 0.012 ; 0.022
  BOND LENGTHS OTHERS               (A):  2095 ; 0.002 ; 0.020
  BOND ANGLES REFINED ATOMS   (DEGREES):  3079 ; 1.434 ; 1.970
  BOND ANGLES OTHERS          (DEGREES):  4880 ; 1.216 ; 3.000
  TORSION ANGLES, PERIOD 1    (DEGREES):   271 ; 5.456 ; 5.000
  CHIRAL-CENTER RESTRAINTS       (A**3):   339 ; 0.083 ; 0.200
  GENERAL PLANES REFINED ATOMS      (A):   [?] ; 0.009 ; 0.020
  GENERAL PLANES OTHERS             (A):   462 ; 0.011 ; 0.020
  NON-BONDED CONTACTS REFINED ATOMS (A):   517 ; 0.238 ; [?]
  NON-BONDED CONTACTS OTHERS        (A):  2522 ; [?] ; [?]
  NON-BONDED TORSION OTHERS         (A):  1336 ; 0.088 ; 0.200
  H-BOND (X...Y) REFINED ATOMS      (A):   241 ; 0.163 ; [?]
  SYMMETRY VDW REFINED ATOMS        (A):    16 ; [?] ; [?]
  SYMMETRY VDW OTHERS               (A):   [?] ; [?] ; [?]
  SYMMETRY H-BOND REFINED ATOMS     (A):    23 ; [?] ; [?]

ISOTROPIC THERMAL FACTOR RESTRAINTS.       COUNT    RMS     WEIGHT
  MAIN-CHAIN BOND REFINED ATOMS  (A**2):  1362 ; [?].094 ; 1.500
  MAIN-CHAIN ANGLE REFINED ATOMS (A**2):  2217 ; [?].859 ; 2.000
  SIDE-CHAIN BOND REFINED ATOMS  (A**2):   916 ; [?].488 ; 3.000
  SIDE-CHAIN ANGLE REFINED ATOMS (A**2):   862 ; [?].822 ; 4.500

ANISOTROPIC THERMAL FACTOR RESTRAINTS.     COUNT    RMS    WEIGHT
  RIGID-BOND RESTRAINTS          (A**2):  2278 ; 1.321 ; 2.000
  SPHERICITY; BONDED ATOMS       (A**2):  2226 ; 1.814 ; 2.000

NCS RESTRAINTS STATISTICS
  NUMBER OF NCS GROUPS : NULL

TLS DETAILS
  NUMBER OF TLS GROUPS : 1

  TLS GROUP : 1
   NUMBER OF COMPONENTS GROUP : 1
   COMPONENTS        C SSSEQI   TO  C SSSEQI
   RESIDUE RANGE :   A   419        A   691
   ORIGIN FOR THE GROUP (A):   7.0590   1.6770  18.9230
   T TENSOR
     T11:   0.0106 T22:   0.0198
     T33:   0.0169 T12:  -0.0142
     T13:  -0.0005 T23:   0.0042
   L TENSOR
     L11:   0.7756 L22:   0.7085
     L33:   0.5853 L12:  -0.2205
     L13:   0.1565 L23:  -0.0117
   S TENSOR
     S11:   0.0307 S12:  -0.0104 S13:   0.0730
     S21:   0.0204 S22:   0.0478 S23:  -0.0005
     S31:   0.0401 S32:   0.0386 S33:  -0.0171

BULK SOLVENT MODELLING.
  METHOD USED : BABINET MODEL WITH MASK
  PARAMETERS FOR MASK CALCULATION
  VDW PROBE RADIUS   : 1.40
  ION PROBE RADIUS   : 0.80
  SHRINKAGE RADIUS   : 0.80

OTHER REFINEMENT REMARKS:
  HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS

CRYST1      [?]   46.970   80.360  90.00  92.63    [?]   1 21 1
SCALE1      0.026901  0.000000  0.001235
SCALE2      0.000000  0.021290  0.000000
SCALE3      0.000000  0.000000  0.012457
ATOM      1  N   MET A 419     -17.798  13.824     [?] [?].00 37.08      A    N
ANISOU    1  N   MET A 419     4698   4704   4686      2    -13     12  A    N
ATOM      3  CA  MET A 419     -17.141  14.629  25.645  1.00 36.94      A    C
ANISOU    3  CA  MET A 419     4672   4681   4680    -19     -7     -4  A    C
ATOM      5  CB  MET A 419     -18.186  15.173  24.668  1.00 37.63      A    C
ANISOU    5  CB  MET A 419     4763   4778   4757      9     -9     24  A    C
ATOM      8  CG  MET A 419     -19.078  14.098  24.049  1.00 39.47      A    C
ANISOU    8  CG  MET A 419     4983   5017   4994    -61    -50      8  A    C
ATOM     11  SD  MET A 419     -18.149  12.778  23.218  1.00 42.55      A    S
ANISOU   11  SD  MET A 419     5414   5343   5409     11     26    -34  A    S
ATOM     12  CE  MET A 419     -17.963  11.571  24.548  1.00 42.75      A    C
ANISOU   12  CE  MET A 419     5417   5401   5423    -17    -19     12  A    C
ATOM     16  C   MET A 419     -16.343  15.776  26.257  1.00 35.96      A    C
ANISOU   16  C   MET A 419     4538   4570   4553     -2     21      4  A    C
ATOM     17  O   MET A 419     -16.823  16.469  27.161  1.00 36.07      A    O
ANISOU   17  O   MET A 419     4561   4581   4561     -5     15    -19  A    O
ATOM     20  N   ILE A 420     -15.136  15.980  25.730  1.00 34.59      A    N
ANISOU   20  N   ILE A 420     4374   4378   4388      7    -11
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REMARK 
REMARK
HEADER    XX-XXX-XX
COMPND
REMARK   3
REMARK   3 REFINEMENT.
REMARK   3   PROGRAM     : REFMAC 5.1.25
REMARK   3   AUTHORS     : MURSHUDOV, VAGIN, DODSON
REMARK   3
REMARK   3    REFINEMENT TARGET : MAXIMUM LIKELIHOOD
REMARK   3
REMARK   3  DATA USED IN REFINEMENT.
REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) : 1.80
REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) : 81.65
REMARK   3   DATA CUTOFF            (SIGMA(F)) : NONE
REMARK   3   COMPLETENESS FOR RANGE        (%) : 99.77
REMARK   3   NUMBER OF REFLECTIONS             : 24820
REMARK   3
REMARK   3  FIT TO DATA USED IN REFINEMENT.
REMARK   3   CROSS-VALIDATION METHOD          : THROUGHOUT
REMARK   3   FREE R VALUE TEST SET SELECTION  : RANDOM
REMARK   3   R VALUE     (WORKING + TEST SET) : 0.18829
REMARK   3   R VALUE            (WORKING SET) : 0.18620
REMARK   3   FREE R VALUE                     : 0.22809
REMARK   3   FREE R VALUE TEST SET SIZE   (%) : 5.1
REMARK   3   FREE R VALUE TEST SET COUNT      : 1327
REMARK   3
REMARK   3  FIT IN THE HIGHEST RESOLUTION BIN.
REMARK   3   TOTAL NUMBER OF BINS USED           : 20
REMARK   3   BIN RESOLUTION RANGE HIGH           : 1.800
REMARK   3   BIN RESOLUTION RANGE LOW            : 1.847
REMARK   3   REFLECTION IN BIN     (WORKING SET) : 1749
REMARK   3   BIN R VALUE           (WORKING SET) : 0.242
REMARK   3   BIN FREE R VALUE SET COUNT          : 90
REMARK   3   BIN FREE R VALUE                    : 0.288
REMARK   3
REMARK   3  NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
REMARK   3   ALL ATOMS                : 2507
REMARK   3
REMARK   3  B VALUES.
REMARK   3   FROM WILSON PLOT           (A**2) : NULL
REMARK   3   MEAN B VALUE      (OVERALL, A**2) : 17.218
REMARK   3   OVERALL ANISOTROPIC B VALUE.
REMARK   3    B11 (A**2) : -0.09
REMARK   3    B22 (A**2) : 0.14
REMARK   3    B33 (A**2) : -0.04
REMARK   3    B12 (A**2) : 0.00
REMARK   3    B13 (A**2) : -0.02
REMARK   3    B23 (A**2) : 0.00
REMARK   3
REMARK   3  ESTIMATED OVERALL COORDINATE ERROR.
REMARK   3   ESU BASED ON R VALUE                            (A): 0.141
REMARK   3   ESU BASED ON FREE R VALUE                       (A): 0.133
REMARK   3   ESU BASED ON MAXIMUM LIKELIHOOD                 (A): 0.082
REMARK   3   ESU FOR B VALUES BASED ON MAXIMUM LIKELIHOOD (A**2): 2.620
REMARK   3
REMARK   3 CORRELATION COEFFICIENTS.
REMARK   3   CORRELATION COEFFICIENT FO-FC      : 0.948
REMARK   3   CORRELATION COEFFICIENT FO-FC FREE : 0.929
REMARK   3
REMARK   3  RMS DEVIATIONS FROM IDEAL VALUES        COUNT    RMS    WEIGHT
REMARK   3   BOND LENGTHS REFINED ATOMS        (A):  2310 ; 0.010 ; 0.022
REMARK   3   BOND LENGTHS OTHERS               (A):  2097 ; 0.002 ; 0.020
REMARK   3   BOND ANGLES REFINED ATOMS   (DEGREES):  3134 ; 1.372 ; 1.981
REMARK   3   BOND ANGLES OTHERS          (DEGREES):  4890 ; 0.790 ; 3.000
REMARK   3   TORSION ANGLES, PERIOD 1    (DEGREES):   272 ; 5.281 ; 5.000
REMARK   3   CHIRAL-CENTER RESTRAINTS       (A**3):   344 ; 0.076 ; 0.200
REMARK   3   GENERAL PLANES REFINED ATOMS      (A):  2489 ; 0.005 ; 0.020
REMARK   3   GENERAL PLANES OTHERS             (A):   465 ; 0.002 ; 0.020
REMARK   3   NON-BONDED CONTACTS REFINED ATOMS (A):   475 ; 0.205 ; 0.200
REMARK   3   NON-BONDED CONTACTS OTHERS        (A):  2364 ; 0.223 ; 0.200
REMARK   3   NON-BONDED TORSION OTHERS         (A):  1222 ; 0.081 ; 0.200
REMARK   3   H-BOND (X...Y) REFINED ATOMS      (A):   147 ; 0.162 ; 0.200
REMARK   3   SYMMETRY VDW REFINED ATOMS        (A):    21 ; 0.168 ; 0.200
REMARK   3   SYMMETRY VDW OTHERS               (A):    86 ; 0.250 ; 0.200
REMARK   3   SYMMETRY H-BOND REFINED ATOMS     (A):    14 ; 0.111 ; 0.200
REMARK   3
REMARK   3  ISOTROPIC THERMAL FACTOR RESTRAINTS.     COUNT    RMS     WEIGHT
REMARK   3   MAIN-CHAIN BOND REFINED ATOMS  (A**2):  1365 ; [?].818 ; 1.500
REMARK   3   MAIN-CHAIN ANGLE REFINED ATOMS (A**2):  2224 ; [?].568 ; 2.000
REMARK   3   SIDE-CHAIN BOND REFINED ATOMS  (A**2):   945 ; [?].206 ; 3.000
REMARK   3   SIDE-CHAIN ANGLE REFINED ATOMS (A**2):   910 ; [?].668 ; 4.500
REMARK   3
REMARK   3  NCS RESTRAINTS STATISTICS
REMARK   3   NUMBER OF NCS GROUPS : NULL
REMARK   3
REMARK   3  TLS DETAILS
REMARK   3   NUMBER OF TLS GROUPS : 1
REMARK   3
REMARK   3   TLS GROUP : 1
REMARK   3    NUMBER OF COMPONENTS GROUP : 1
REMARK   3    COMPONENTS        C SSSEQI   TO  C SSSEQI
REMARK   3    RESIDUE RANGE :   A   419        A   691
REMARK   3    ORIGIN FOR THE GROUP (A):   5.9620   1.7680  19.1340
REMARK   3    T TENSOR
REMARK   3      T11:   0.0048 T22:   0.0352
REMARK   3      T33:   0.0580 T12:  -0.0119
REMARK   3      T13:  -0.0081 T23:   0.0084
REMARK   3    L TENSOR
REMARK   3      L11: [?].3962 L22:   0.3784
REMARK   3      L33: [?].2902 L12:   0.1647
REMARK   3      L13: [?].0731 L23:   0.0592
REMARK   3    S TENSOR
REMARK   3      S11: [?].0145 S12: [?].0246 S13:   0.0170
REMARK   3      S21: [?].0077 S22: [?].0410 S23:   0.0381
REMARK   3      S31:  -0.0159 S32: [?].0355 S33:  -0.0265
REMARK   3
REMARK   3  BULK SOLVENT MODELLING.
REMARK   3   METHOD USED : BABINET MODEL WITH MASK
REMARK   3   PARAMETERS FOR MASK CALCULATION
REMARK   3   VDW PROBE RADIUS   : 1.40
REMARK   3   ION PROBE RADIUS   : 0.80
REMARK   3   SHRINKAGE RADIUS   : 0.80
REMARK   3
REMARK   3  OTHER REFINEMENT REMARKS:
REMARK   3  HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS
REMARK   3
CRYST1   37.316   46.978   81.109  90.00  92.83  90.00 P 1 21 1
SCALE1      0.026798  0.000000  0.001323        0.00000
SCALE2      0.000000  0.021287  0.000000        0.00000
SCALE3      0.000000  0.000000  0.012344        0.00000
ATOM      1  N   MET A 419     -17.724  15.274  26.545  1.00 41.92      A    N
ATOM      3  CA  MET A 419     -16.798  15.014  25.404  1.00 41.87      A    C
ATOM      5  CB  MET A 419     -17.513  15.303  24.075  1.00 42.37      A    C
ATOM      8  CG  MET A 419     -18.905  14.692  23.955  1.00 44.21      A    C
ATOM     11  SD  MET A 419     -18.872  12.884  23.783  1.00 48.64      A    S
ATOM     12  CE  MET A 419     -19.036  12.354  25.521  1.00 47.82      A    C
ATOM     16  C   MET A 419     -15.527  15.875  25.505  1.00 40.85      A    C
ATOM     17  O   MET A 419     -14.857  16.115  24.495  1.00 41.39      A    O
ATOM     20  N   ILE A 420     -15.200  16.322  26.719  1.00 39.21      A    N
ATOM     22  CA  ILE A 420     -14.050  17.208  26.982  1.00 37.79      A    C
ATOM     24  CB  ILE A 420     -12.697  16.519  26.689  1.00 37.82      A    C
ATOM     26  CG1 ILE A 420     -12.557  15.240  27.512  1.00 38.41      A    C
ATOM     29  CD1 ILE A 420     -11.209  14.556  27.348  1.00 38.81      A    C
ATOM     33  CG2 ILE A 420     -11.539  17.494  26.996  1.00 37.64      A    C
ATOM     37  C   ILE A 420     -14.081  18.526  26.218  1.00 36.12      A    C
ATOM     38  O   ILE A 420     -13.850  18.561  25.013  1.00 36.25      A    O
ATOM     39  N   ALA A 421     -14.316  19.613  26.935  1.00 34.13      A    N
ATOM     41  CA  ALA A 421     -14.207  20.936  26.356  1.00 32.57      A    C
ATOM     43  CB  ALA A 421     -15.126  21.909  27.076  1.00 32.58      A    C
ATOM     47  C   ALA A 421     -12.762  21.394  26.457  1.00 31.11      A    C
ATOM     48  O   ALA A 421     -12.009  20.935  27.315  1.00 30.48      A    O
ATOM     49  N   ARG A 422     -12.385  22.305  25.572  1.00 29.35      A    N
ATOM     51  CA  ARG A 422     -11.069  22.917  25.610  1.00 28.21      A    C
ATOM     53  CB  ARG A 422     -10.957  23.974  24.506  1.00 28.08      A    C
ATOM     56  CG  ARG A 422      -9.542  24.501  24.279  1.00 27.30      A    C
ATOM     59  CD  ARG A 422      -9.471  25.640  23.289  1.00 25.94      A    C
ATOM     62  NE  ARG A 422     -10.069  25.288  22.005  1.00 25.59      A    N
ATOM     64  CZ  ARG A 422      -9.474  24.572  21.057  1.00 24.52      A    C
ATOM     65  NH1 ARG A 422      -8.241  24.095  21.225  1.00 22.64      A    N
ATOM     68  NH2 ARG A 422     -10.124  24.320  19.932  1.00 24.29      A    N
ATOM     71  C   ARG A 422     -10.773  23.535  26.985  1.00 27.22      A    C
ATOM     72  O   ARG A 422      -9.632  23.519  27.435  1.00 26.74      A    O
ATOM     73  N   GLU A 423     -11.808  24.051  27.652  1.00 26.08      A    N
ATOM     75  CA  GLU A 423     -11.674  24.666  28.979  1.00 25.78      A    C
ATOM     77  CB  GLU A 423     -13.012  25.237  29.474  1.00 26.29      A    C
ATOM     80  CG  GLU A 423     -13.662  26.233  28.552  1.00 28.01      A    C
ATOM     83  CD  GLU A 423     -14.629  25.584  27.583  1.00 29.62      A    C
ATOM     84  OE1 GLU A 423     -14.183  25.287  26.450  1.00 28.40      A    O
ATOM     85  OE2 GLU A 423     -15.823  25.382  27.960  1.00 30.82      A    O
ATOM     86  C   GLU A 423     -11.224  23.675  30.040  1.00 24.55      A    C
ATOM     87  O   GLU A 423     -10.636  24.070  31.034  1.00 24.25      A    O
ATOM     88  N   ASP A 424     -11.550  22.401  29.843  1.00 23.65      A    N
ATOM     90  CA  ASP A 424     -11.151  21.351  30.778  1.00 23.18      A    C
ATOM     92  CB  ASP A 424     -11.925  20.056  30.503  1.00 23.13      A    C
ATOM     95  CG  ASP A 424     -13.436  20.219  30.670  1.00 25.17      A    C
ATOM     96  OD1 ASP A 424     -13.848  21.127  31.427  1.00 26.20      A    O
ATOM     97  OD2 ASP A 424     -14.276  19.481  30.095  1.00 26.16      A    O
ATOM     98  C   ASP A 424      -9.639  21.067  30.742  1.00 22.42      A    C
ATOM     99  O   ASP A 424      -9.148  20.360  31.606  1.00 21.99      A    O
ATOM    100  N   VAL A 425      -8.920  21.606  29.752  1.00 21.44      A    N
ATOM    102  CA  VAL A 425      -7.488  21.342  29.592  1.00 21.23      A    C
ATOM    104  CB  VAL A 425      -7.184  20.639  28.249  1.00 21.09      A    C
ATOM    106  CG1 VAL A 425      -5.678  20.393  28.092  1.00 21.37      A    C
ATOM    110  CG2 VAL A 425      -7.963  19.337  28.133  1.00 20.93      A    C
ATOM    114  C   VAL A 425      -6.715  22.641  29.649  1.00 21.24      A    C
ATOM    115  O   VAL A 425      -6.957  23.541  28.836  1.00 21.03      A    O
ATOM    116  N   VAL A 426      -5.824  22.742  30.631  1.00 20.75      A    N
ATOM    118  CA  VAL A 426      -4.965  23.894  30.830  1.00 21.32      A    C
ATOM    120  CB  VAL A 426      -4.992  24.367  32.300  1.00 21.30      A    C
ATOM    122  CG1 VAL A 426      -4.044  25.545  32.514  1.00 22.52      A    C
ATOM    126  CG2 VAL A 426      -6.415  24.743  32.718  1.00 21.78      A    C
ATOM    130  C   VAL A 426      -3.522  23.530  30.466  1.00 21.27      A    C
ATOM    131  O   VAL A 426      -2.931  22.621  31.046  1.00 20.65      A    O
ATOM    132  N   LEU A 427      -2.960  24.253  29.509  1.00 21.37      A    N
ATOM    134  CA  LEU A 427      -1.585  24.024  29.079  1.00 21.38      A    C
ATOM    136  CB  LEU A 427      -1.413  24.387  27.593  1.00 21.39      A    C
ATOM    139  CG  LEU A 427      -2.428  23.797  26.603  1.00 21.09      A    C
ATOM    141  CD1 LEU A 427      -2.214  24.341  25.191  1.00 21.00      A    C
ATOM    145  CD2 LEU A 427      -2.399  22.267  26.592  1.00 20.75      A    C
ATOM    149  C   LEU A 427      -0.626  24.841  29.931  1.00 22.15      A    C
ATOM    150  O   LEU A 427      -0.819  26.043  30.102  1.00 21.62      A    O
ATOM    151  N   ASN A 428       0.413  24.189  30.448  1.00 22.43      A    N
ATOM    153  CA  ASN A 428       1.416  24.834  31.311  1.00 23.45      A    C
ATOM    155  CB  ASN A 428       1.710  23.923  32.508  1.00 23.76      A    C
ATOM    158  CG  ASN A 428       0.458  23.579  33.293  1.00 26.13      A    C
ATOM    159  OD1 ASN A 428       0.301  22.454  33.774  1.00 30.61      A    O
ATOM    160  ND2 ASN A 428      -0.455  24.536  33.400  1.00 28.03      A    N
ATOM    163  C   ASN A 428       2.728  25.192  30.611  1.00 23.29      A    C
ATOM    164  O   ASN A 428       3.316  26.231  30.907  1.00 23.11      A    O
ATOM    165  N   ARG A 429       3.217  24.316  29.732  1.00 23.23      A    N
ATOM    167  CA  ARG A 429       4.438  24.593  28.965  1.00 23.80      A    C
ATOM    169  CB  ARG A 429       5.678  24.419  29.846  1.00 24.47      A    C
ATOM    172  CG  ARG A 429       5.945
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-1. 


759 


1 


.00 


40 


.11 


w 


o 


ATOM   5035  O   HOH W 179      13.927 -22.767  20.107  1.00 44.51      W    O
ATOM   5038  O   HOH W 180      -5.833  17.092  42.432  1.00 39.14      W    O
ATOM   5041  O   HOH W 181       4.780  28.120  30.190  1.00 35.76      W    O
ATOM   5044  O   HOH W 182       5.232 -21.397  10.130  1.00 32.74      W    O
ATOM   5047  O   HOH W 183      -0.609 -13.905  29.169  1.00 42.77      W    O
ATOM   5050  O   HOH W 184      -8.615 -12.423  20.365  1.00 38.88      W    O
ATOM   5053  O   HOH W 185      -2.742  24.869  15.943  1.00 35.44      W    O
ATOM   5056  O   HOH W 186      -6.914  23.870  13.421  1.00 38.92      W    O
ATOM   5059  O   HOH W 187      -9.844   5.148   9.425  1.00 47.06      W    O
ATOM   5062  O   HOH W 188      -3.377   1.908  31.500  1.00 47.62      W    O
ATOM   5065  O   HOH W 189       8.535  -2.717   9.847  1.00 20.99      W    O
ATOM   5068  O   HOH W 190      -2.120  23.920  11.635  1.00 33.73      W    O
ATOM   5071  O   HOH W 191      -1.821  12.406  36.174  1.00 33.31      W    O
ATOM   5074  O   HOH W 192      15.430 -22.456  17.980  1.00 32.67      W    O
ATOM   5077  O   HOH W 193      26.766  -7.447  10.363  1.00 37.80      W    O
ATOM   5080  O   HOH W 194       7.871 -12.672   1.551  1.00 33.24      W    O
ATOM   5083  O   HOH W 195      11.658 -13.419  -1.004  1.00 38.16      W    O
ATOM   5086  O   HOH W 196      16.826   1.526   0.614  1.00 50.30      W    O
ATOM   5089  O   HOH W 197      15.595   6.152  14.916  1.00 40.35      W    O
ATOM   5092  O   HOH W 198      -1.881  25.715  18.176  1.00 37.84      W    O
ATOM   5095  O   HOH W 199     -11.651  11.014  33.297  1.00 32.46      W    O
ATOM   5098  O   HOH W 200      18.893  -2.239   0.202  1.00 36.54      W    O
ATOM   5101  O   HOH W 201       8.083 -15.775  41.296  1.00 35.79      W    O
ATOM   5104  O   HOH W 202      27.247  -8.234   2.554  1.00 45.23      W    O
ATOM   5107  O   HOH W 203      -1.222 -15.328  27.055  1.00 46.27      W    O
ATOM   5110  O   HOH W 204      13.756  -2.002  41.227  1.00 44.19      W    O
ATOM   5113  O   HOH W 205      18.212  11.234  28.397  1.00 46.27      W    O
ATOM   5116  O   HOH W 206      10.446  24.528  22.495  1.00 29.18      W    O
ATOM   5119  O   HOH W 207       9.812   7.702  16.580  1.00 47.51      W    O
ATOM   5122  O   HOH W 208       8.614  13.189  26.746  1.00 42.96      W    O
ATOM   5125  O   HOH W 209      26.242   4.872  13.201  1.00 34.36      W    O
ATOM   5128  O   HOH W 210       3.417  32.021   7.569  1.00 43.78      W    O
ATOM   5131  O   HOH W 211      13.082  12.607  -0.030  1.00 38.29      W    O
ATOM   5134  O   HOH W 212      11.113 -16.047   1.534  1.00 48.53      W    O
ATOM   5137  O   HOH W 213      16.799 -19.980  17.140  1.00 42.01      W    O
ATOM   5140  O   HOH W 214       8.100   2.699   9.516  1.00 38.54      W    O
ATOM   5143  O   HOH W 215      21.192 -10.596  35.721  1.00 42.09      W    O
ATOM   5146  O   HOH W 216     -12.311  27.355  25.513  1.00 46.88      W    O
ATOM   5149  O   HOH W 217       2.196  18.838   8.312  1.00 40.97      W    O
ATOM   5152  O   HOH W 218       2.760   7.802  34.907  1.00 40.65      W    O
ATOM   5155  O   HOH W 219       2.052  34.572   8.653  1.00 28.96      W    O
ATOM   5158  O   HOH W 220      20.199  10.058  26.993  1.00 39.00      W    O
ATOM   5161  O   HOH W 221       0.666   6.917  33.693  1.00 41.05      W    O
ATOM   5164  O   HOH W 222      19.656 -20.516  17.448  1.00 49.29      W    O
ATOM   5167  O   HOH W 223      24.529 -13.997  28.898  1.00 44.80      W    O
ATOM   5170  O   HOH W 224     -15.502   8.924  35.463  1.00 46.22      W    O
ATOM   5173  O   HOH W 225       6.321   6.726  39.047  1.00 40.39      W    O
ATOM   5176  O   HOH W 226      13.346 -19.882  30.564  1.00 27.35      W    O
ATOM   5179  O   HOH W 227       2.475   8.509  20.962  1.00 37.71      W    O
ATOM   5182  O   HOH W 228      25.697 -17.409   7.650  1.00 40.93      W    O
ATOM   5185  O   HOH W 229      -5.326  22.902  36.076  1.00 34.85      W    O
ATOM   5188  O   HOH W 230      18.689   1.894   1.969  1.00 40.05      W    O
ATOM   5191  O   HOH W 231      22.256   9.449  29.382  1.00 38.70      W    O
ATOM   5194  O   HOH W 232       6.671  16.029  32.045  1.00 45.79      W    O
ATOM   5197  O   HOH W 233     -12.901  11.354  24.852  1.00 42.77      W    O
ATOM   5200  O   HOH W 234      11.146 -21.564   4.017  1.00 44.94      W    O
ATOM   5203  O   HOH W 235      25.034  -1.313  25.290  1.00 35.63      W    O
ATOM   5206  O   HOH W 236     -14.460  18.497  19.730  1.00 43.85      W    O
ATOM   5209  O   HOH W 237     -12.580  11.671  30.829  1.00 49.50      W    O
ATOM   5212  O   HOH W 238     -15.352   9.953  23.232  1.00 41.57      W    O
ATOM   5215  O   HOH W 239      -1.668  11.121  33.395  1.00 56.23      W    O
ATOM   5218  O   HOH W 240       8.754 -13.126  -3.263  1.00 52.79      W    O
ATOM   5221  O   HOH W 241      -1.876  -3.651   6.368  1.00 34.87      W    O
ATOM   5224  O   HOH W 242      19.512 -18.187   6.893  1.00 42.20      W    O
ATOM   5227  O   HOH W 243      -8.077  30.761  18.011  1.00 33.22      W    O
ATOM   5230  O   HOH W 244      11.861  14.174  19.992  1.00 49.09      W    O
ATOM   5233  O   HOH W 245       0.301  11.850  -6.763  1.00 46.36      W    O
ATOM   5236  O   HOH W 246      -2.640  12.555   0.451  1.00 48.51      W    O
ATOM   5239  O   HOH W 247      12.588 -19.822  12.376  1.00 48.93      W    O
ATOM   5242  O   HOH W 248      11.738  20.944  27.362  1.00 51.67      W    O
ATOM   5245  O   HOH W 249      17.071   8.568  19.617  1.00 43.62      W    O
ATOM   5248  O   HOH W 250      -2.350 -17.778  11.732  1.00 51.64      W    O
ATOM   5251  O   HOH W 251      12.306  -4.700  40.223  1.00 48.28      W    O
END
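
The coordinate listing can be sanity-checked by computing interatomic distances from individual records. The following is only a minimal sketch: it assumes each ATOM record is available as one whitespace-separated row (serial, atom name, residue, chain, residue number, x, y, z, occupancy, B-factor, segment, element), and the helper names `parse_atom` and `distance` are illustrative, not part of the original disclosure. Files in strict PDB layout would normally be read by fixed columns instead.

```python
import math


def parse_atom(record):
    """Split one whitespace-separated ATOM record into a small dict."""
    f = record.split()
    return {
        "serial": int(f[1]),
        "name": f[2],
        "res_name": f[3],
        "chain": f[4],
        "res_seq": int(f[5]),
        "xyz": (float(f[6]), float(f[7]), float(f[8])),
        "occupancy": float(f[9]),
        "b_factor": float(f[10]),
    }


def distance(a, b):
    """Euclidean distance in Angstroms between two parsed atoms."""
    return math.dist(a["xyz"], b["xyz"])


# Two records taken from the listing: carbonyl C of VAL 425 and amide N of VAL 426.
carbonyl_c = parse_atom("ATOM 114 C VAL A 425 -6.715 22.641 29.649 1.00 21.24 A C")
amide_n = parse_atom("ATOM 116 N VAL A 426 -5.824 22.742 30.631 1.00 20.75 A N")

# The C-N separation across a peptide bond is expected to be close to 1.33 Angstroms.
peptide_bond = distance(carbonyl_c, amide_n)
print(f"C(425)-N(426) distance: {peptide_bond:.2f} A")
```

A value near 1.33 Å for consecutive residues is a quick indication that the x, y and z columns were read in the right order.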
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Table 3 



Alignment of PYK2 with other tyrosine kinase structures 

Aligned sequences (kinase_PDB code): PYK2, SRC_2src, HCK_1qcf, LCK_1qpe, ABL1_1iep, CSK_1byg, TEK_1fvr, KDR_1vr2, FGFR1_2fgi, INSR_1ir3, IGF1R_1k3a, EPHB2_1jpa, EGFR_1m17 

[Residue-level alignment figure not recoverable from the source.] 
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Table 4 



PYK2 in pET15S 



U33284; Human protein tyrosine kinase PYK2 mRNA, complete cds 

Full-length protein in pET15S: 293 aa (SEQ ID NO: 2) Mass: 33872.2 pI: 6.07 

PYK2 kinase domain I420-M691 (not including first 21 aa in following sequence) SEQ ID NO: 1 

1 MGSSHHHHHH SSGLVPRGSH MIAREDVVLN RILGEGFFGE VYEGVYTNHK GEKINVAVKT 

61 CKKDCTLDNK EKFMSEAVIM KNLDHPHIVK LIGIIEEEPT WIIMELYPYG ELGHYLERNK 

121 NSLKVLTLVL YSLQICKAMA YLESINCVHR DIAVRNILVA SPECVKLGDF GLSRYIEDED 

181 YYKASVTRLP IKWMSPESIN FRRFTTASDV WMFAVCMWEI LSFGKQPFFW LENKDVIGVL 

241 EKGDRLPKPD LCPPVLYTLM TRCWDYDPSD RPRFTELVCS LSDVYQMEKD IAM 



SEQ ID NO: 5 

PYK2-C1; 5'-TCCACAG CATATG ATTGCCCGTGAAGATGTGGT-3' 33 mer 

SEQ ID NO: 6 

PYK2-N2; TGGAGAAGGACATTGCCATG TAG GTCGAC GAGAG (Origin) 

5'-CTCTC GTCGAC CTA CATGGCAATGTCCTTCTCCA-3' 34 mer 

pET15S sequence PCR product; 843 bp (SEQ ID NO: 4) 

Sequence encoding PYK2 kinase domain (in small letters below) (SEQ ID NO: 3) 

TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGA 

TATACCATGGGCAGCAGCCATCATCATCATCATCACAGCAGCGGCCTGGTGCCGCGCGGCAGCCATATG 
attgcc cgtgaagatg 

1381 tggtcctgaa tcgtattctt ggggaaggct tttttgggga ggtctatgaa ggtgtctaca 
1441 caaatcacaa aggggagaaa atcaatgtag ctgtcaagac ctgcaagaaa gactgcactc 
1501 tggacaacaa ggagaagttc atgagcgagg cagtgatcat gaagaacctc gaccacccgc 
1561 acatcgtgaa gctgatcggc atcattgaag aggagcccac ctggatcatc atggaattgt 
1621 atccctatgg ggagctgggc cactacctgg agcggaacaa gaactccctg aaggtgctca 
1681 ccctcgtgct gtactcactg cagatatgca aagccatggc ctacctggag agcatcaact 
1741 gcgtgcacag ggacattgct gtccggaaca tcctggtggc ctcccctgag tgtgtgaagc 
1801 tgggggactt tggtctttcc cggtacattg aggacgagga ctattacaaa gcctctgtga 
1861 ctcgtctccc catcaaatgg atgtccccag agtccattaa cttccgacgc ttcacgacag 
1921 ccagtgacgt ctggatgttc gccgtgtgca tgtgggagat cctgagcttt gggaagcagc 
1981 ccttcttctg gctggagaac aaggatgtca tcggggtgct ggagaaagga gaccggctgc 
2041 ccaagcctga tctctgtcca ccggtccttt ataccctcat gacccgctgc tgggactacg 
2101 accccagtga ccggccccgc ttcaccgagc tggtgtgcag cctcagtgac gtttatcaga 
2161 tggagaagga cattgccatg 

TAGGTCGACTAGAGCCTGCAGTCrCGACCATCATCATCATCATCATTAATAAAAGGGCG 
AATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGG 
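
The relationship between the primers of Table 4 and the coding sequence can be checked with a few lines of code. This is a hedged sketch rather than the procedure used to design the primers: the helper `reverse_complement` and the variable names are illustrative, and the labels NdeI (CATATG) and SalI (GTCGAC) are the standard recognition sequences of those enzymes, which the table itself does not name.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Forward primer PYK2-C1: leader, CATATG site (NdeI), first 20 nt of the kinase domain.
primer_c1 = "TCCACAG" + "CATATG" + "ATTGCCCGTGAAGATGTGGT"

# Coding-strand segment given for PYK2-N2: last 20 nt of the kinase domain,
# TAG stop codon, GTCGAC site (SalI), and a short tail.
coding_end = "TGGAGAAGGACATTGCCATG" + "TAG" + "GTCGAC" + "GAGAG"
primer_n2 = reverse_complement(coding_end)

# Start of the kinase-domain coding sequence as listed in Table 4 (SEQ ID NO: 3).
coding_start = "attgcccgtgaagatgtggtcctgaatcgt"

print(primer_n2, len(primer_n2))
print(len(primer_c1))
print(coding_start.upper().startswith(primer_c1[13:]))
```

Under these assumptions the computed reverse primer matches the 34 mer listed for PYK2-N2, and the annealing portion of PYK2-C1 matches the first 20 nucleotides of the kinase-domain coding sequence.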
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Table 5: Pyk2 Activity and the Inhibition by ATP Analogs 



Pyk2       Vmax     Vmax (SE)  K        K (SE)   K (Lo 95%)  K (Up 95%)  Equation 
8 ng/well  1.25e+4  9.11e+2    7.37e+0  2.79e+0  3.27e+0     1.66e+1     y = (Vmax * x) / (K + x) 




















Compounds  Vmax     K        K (SE)   K (Lo 95%)  K (Up 95%)  Y2       n         Equation 
Adenosine  1.82e+4  2.54e+2  2.65e+2  2.47e+1     2.60e+3     7.33e+2  -5.14e-1  y = ((Vmax * x^n) / (K^n + x^n)) + Y2 
AMP        1.82e+4  8.02e+1  3.76e+1  2.82e+1     2.28e+2     7.33e+2  -5.05e-1  y = ((Vmax * x^n) / (K^n + x^n)) + Y2 
ADP        1.82e+4  1.49e+1  2.69e+0  9.93e+0     2.22e+1     7.33e+2  -7.69e-1  y = ((Vmax * x^n) / (K^n + x^n)) + Y2 
AMPPCP     1.82e+4  7.69e+3  1.99e+4  2.43e+1     2.44e+6     7.33e+2  -2.03e-1  y = ((Vmax * x^n) / (K^n + x^n)) + Y2 
AMPPNP     1.82e+4  1.81e+1  2.82e+0  1.28e+1     2.56e+1     7.33e+2  -7.18e-1  y = ((Vmax * x^n) / (K^n + x^n)) + Y2 
ATP-g-S    1.82e+4  1.36e+1  1.49e+0  1.06e+1     1.73e+1     7.33e+2  -9.66e-1  y = ((Vmax * x^n) / (K^n + x^n)) + Y2 
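
The two fitting equations of Table 5 can be evaluated directly. The sketch below is illustrative only: the function names are assumptions, the parameter values are copied from the table, and the concentration units are not stated in the table, so `x` is simply taken to be in the same units as K.

```python
def michaelis_menten(x, vmax, k):
    """Hyperbolic model from the first part of Table 5: y = Vmax*x / (K + x)."""
    return vmax * x / (k + x)


def hill_with_offset(x, vmax, k, n, y2):
    """Four-parameter model from the second part of Table 5.

    y = Vmax * x^n / (K^n + x^n) + Y2. A negative exponent n gives a curve
    that falls with increasing x, so x must be greater than zero.
    """
    return vmax * x**n / (k**n + x**n) + y2


# Pyk2 at 8 ng/well: Vmax = 1.25e+4, K = 7.37.
half_max = michaelis_menten(7.37, vmax=1.25e4, k=7.37)

# Adenosine row: Vmax = 1.82e+4, K = 2.54e+2, n = -5.14e-1, Y2 = 7.33e+2.
adenosine_at_k = hill_with_offset(254.0, vmax=1.82e4, k=254.0, n=-0.514, y2=733.0)

print(round(half_max, 3))        # half of Vmax when x equals K
print(round(adenosine_at_k, 3))  # Vmax/2 + Y2 when x equals K
```

In both models the response at x = K is half of Vmax (plus the offset Y2 in the second model), which is why K can be read as the half-maximal concentration for each compound.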
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